SlideShare a Scribd company logo
1 of 32
Marketing Analytics Case Study
Modelling in Operation Management
A method for discovering clusters of e-commerce interest
patterns using clickstream data
Qiang Su, Lu Chen
A method for discovering
clusters of e-commerce interest
patterns
using clickstream data
Qiang Su, Lu Chen ⇑
School of Economics & Management, Tongji University, Shanghai
200092, China
Objective
Derive interest patterns that can provide significant assistances
on web page optimization and personalized recommendation.
In this paper, with the aim of finding users’ interests based on
their click-stream data, a novel approach to discovering user
interests is developed and studied systematically.
What is Clickstream ??
The record of an internet user’s actions online has come to be known as click-
stream data. Click-stream data captures a wide variety of information in a
complete, timely and accurate manner. This data covers user activities such as
browsing paths, purchased products and clicked banner ads.
1. Introduction
To attract more customers, e-commerce companies are continuously diversifying
their products and increasing their category range. Large e-commerce
organizations frequently see more than a million customers per day log on to their
websites. Those potential customers view hundreds of thousands of catalog items
everyday. As a result, a specific challenge arises for these e-commerce
companies; namely how to discover the website users’ interests and promote
sales by effectively managing an ever-increasing number of categories and
products.
2. Category
structure and user
interest
measurement
2.1. Category structure and assumptions relating to
a user’s interest
In order to attract more customers, e-
business companies have been
expanding to provide more and more
products on their websites. Normally, an
e-commerce website, such as
Amazon.com, presents its product
information using a hierarchical tree
structure, as shown in fig.
2.1. Category structure and
assumptions relating to a user’s
interest
Every user has a
preference when he visits
a website. Users only
visit products that interest
them. Users with similar
interests show similar
browsing behaviors.
The frequency of visits to
the product pages is
positive correlate to
users’ interests.
The duration for which
users stay on a product
webpage is positive
correlate to their
interests.
The sequence of visits to
a product webpage is
positive correlate to
users’ interests.
2.2. Indicators of a
user’s interest
1. Indicator 1 : category visiting path
Session: A session is defined as a period of
sustained web browsing or a sequence of page
viewings.
Browsing path: The browsing path Pi {url1 , url2 ,
url3 , . . . , urln} of useri is a sequence of web pages
browsed during a session. The length of Pi is
expressed as len(Pi) = n. Eg. the browsing path P1
{ctg1 ; item1
1 ; ctg1 ; item2
1 ; item1
2 }
Category visiting path: The category visiting path
CtgPi {ctg1 , ctg2 , ctg3 , . . . , ctgm } is a sequence of
categories visited by user i during any given session.
The length of CtgPi is m, where m<=n. Eg: the
category visiting path of user1 is CtgPi {ctg1 , ctg2 }
according to his browsing path P1 {ctg1 ; item1
1 ; ctg1
; item2
1 ; item1
2 }.
2.2. Indicators of a
user’s interest
Indicator 2: visiting frequency
Hits: For a website, hitsi
j is utilized to count the number of
visits (or hits) to category ctgj by useri in a session. It
consist of two parts: the number of visits to ctgj and the
length of the visits to product items which belong to ctgj .
Hitsi
j = count(useri, ctgj ) Σk=1
l count(useri, itemk
j)
…(1)
Visiting frequency: Visiting frequency is defined as the
ratio of category hits to the length of the user’s browsing
path. ….(2)
Category-user associated matrix with frequency
characteristics: A category-user associated matrix with
frequency characteristic Rt*s
freq is used to describe the
relationship between categories and users in terms of
visiting frequency.
2.2. Indicators of a
user’s interest
Duration: we use durationi
j to denote the total time that
useri spends browsing category ctgj in a session.
…(3)
Relative duration: Relative duration redui
j is defined as the
ratio of time useri spends on category ctgj to the time of the
entire session.
….(4)
Category-User associated matrix with time
characteristics: A category-user associated matrix with
time characteristic Rt*s
time is used to describe the
relationships between categories and users in terms of
3. Design of the rough leader
clustering algorithm
A rough leader clustering algorithm is developed using a leader clustering
algorithm leveraged by a rough set theory. The leader clustering calculation is
based on users’ similarities in terms of browsing behavior. At the same time,
rough set analysis makes it possible for a user to be assigned to more than one
cluster.
3.1 Measurements for
similarity
Given the indicators of browsing behaviors, i.e., category visiting path CtgPi , visiting frequency freqi
j and
relative duration redui
j , the similarity of behaviorscan be evaluated quantitatively. Thereby, users can
accordingly be assigned to different clusters.
Various functions have been developed to estimate the degree of similarity between two users, such as
the Pearson correlation coefficient, cosine similarity or distance measures. The choice of similarity
function should be made properly, based on the data set at hand.
Frequency similarity: The frequency similarity between two users user p and user q is defined as a
cosine similarity measure of two vectors p , q , which are row vectors extracted from matrix Rt*s
freq .
….(5)
3.1 Measurements for
similarity
Duration similarity: Similarly, the duration similarity between two users user p and user q is defined as
follows: ….(6)
Path similarity: The path similarity simpq(path) between two users user p and user q is defined as the
common path length divided by the maximal path length.
…..(7)
Given the above three similarities, the total similarity between two users user p
and user q is defined as follows:
simpq(path) = 𝜶*simpq(seq) + 𝜷*simpq(freq) + 𝛄*simpq(time) …..(8)
In which, 𝜶 , 𝜷, 𝛄 are used to adjust the weight of the three dimensions of
sequence, frequency, and duration (time). Also,
𝜶+𝜷+𝛄 = 1 , 0<=simpq<=1.
3.2 Cluster algorithm
The algorithm starts with a randomly selected object as the initial leader. For
each object in the data set, we calculate the similarity between the object and
the leader. If the similarity meets the predefined threshold, then the object is
assigned to the cluster represented by the leader; otherwise, the object will be
regarded as a new leader. These procedures will be repeated until all objects
are either assigned to a cluster or regarded as leaders.
Leader Clustering Algorithm:
Leader algorithm is a incremental clustering algorithm generally used to cluster
large data sets. This algorithm is order dependent and may form different
clusters based on the order the data set is provided to the algorithm.
The algorithm consists of the following steps.
Step 1: Assign the first data item, P1 to cluster C1. This data set will be the
leader of the cluster C1.
Step 2:Now move to the next data item say P2 and calculate its distance from
the leader P1. If the distance between P2 and leader P1 is less than a user
specified threshold (t) then data point P2 is assigned to this cluster (Cluster C1).
If the distance between leader P1 and data item P2 is more than the user
specified threshold t, then form a new cluster C2 and assign P2 to this new
cluster. P2 will be the leader of the cluster C2.
Step3: For all the remaining data items the distance between the data point and
the leader of the clusters is calculated. If the distance between the data items
and the any of the leader is less then the user specified threshold, the data point
is assigned to that cluster. However, If the distance between the data point and
the any of the cluster's leader is more than the user specified threshold, a new
cluster is created and that particular data point is assigned to that cluster and
considered the leader of the cluster.
Step 4: Repeat Step 3 till all the data items are assigned to clusters
Let’s do it with an example :
A (1, 1),B(1, 2), C(2, 2), D(6, 2), E(7, 2), F(6, 6), G(7, 6)
Rough set property
Property 1: An object can be a member of at most one lower bound. This is
consistent with the definition of lower approximation, where members belong to
only one cluster.
Property 2: The lower bound is contained in the upper bound. An object that is
a member of the lower approximation of a cluster is also a member of the upper
approximation of the same cluster.
Property 3: An object cannot belong to only a single upper bound region. An
object that does not belong to any lower approximation must be a member of at
least two upper approximations.
General threshold Ꞇ(tau) : General threshold s is utilized to control whether or
not a user should be assigned to an existing cluster.
Rough threshold ເ: The overlap among clusters is controlled by rough
threshold f.
Preprocess and Filtering outliers of
data:
In order to improve data quality
● We eliminate a number of unvalued categories and users
● We discard categories and irrelevant records which are very rarely
visited by users.
● We also omit certain invalid customers who visit very few
categories.
● The time spent on a webpage is estimated as the time interval
between two adjacent page requests. In this case, an
extraordinarily long period of time will be considered an outlier
To simplify the calculation, the dataset is evenly divided by a fixed interval,
which ensures the data still covers the full day and thereby ensuring the results
will not be affected by time.
As a result, the experimental dataset contains 10,000 users and 1810
categories. Although the user number in the experimental dataset is only about
one-third of that in the whole dataset, the number of categories is decreased
only very slightly, from 1823 to 1810.
Pseudo Code
Discussion
Comparison with other algorithms As is already well accepted, a good
clustering algorithm should be able to search out clusters with the shortest
distances between members of a single cluster, and with the longest distances
between members of different clusters. From this perspective, Dunn’s index and
Davies–Bouldin’s index are usually employed to compare clustering results.
A smaller D(U) value and a larger DB(U) value mean that derived clusters are
compact, and their centers are far away from each other.
Discussion
General research
route for
clustering
interest patterns
Conclusion
The experiment’s findings demonstrate that the newly-proposed approach
can outperform the traditional leader clustering algorithm in terms of
effectiveness and efficiency. Using this approach, three main interest
patterns are discovered, and the stability of the three patterns is tested.
These interest patterns can provide significant assistance to online retailers
who wish to personalize their website’s layout and navigation structure.
More importantly, these interest patterns can improve the site’s
recommendation strategies and make the site more effective.
Marketing analysis

More Related Content

What's hot

IDENTIFICATION AND INVESTIGATION OF THE USER SESSION FOR LAN CONNECTIVITY VIA...
IDENTIFICATION AND INVESTIGATION OF THE USER SESSION FOR LAN CONNECTIVITY VIA...IDENTIFICATION AND INVESTIGATION OF THE USER SESSION FOR LAN CONNECTIVITY VIA...
IDENTIFICATION AND INVESTIGATION OF THE USER SESSION FOR LAN CONNECTIVITY VIA...ijcseit
 
User Priority Based Search on Organizing User Search Histories with Security
User Priority Based Search on Organizing User Search Histories with SecurityUser Priority Based Search on Organizing User Search Histories with Security
User Priority Based Search on Organizing User Search Histories with SecurityIOSR Journals
 
Practical Knowledge Representation
Practical Knowledge RepresentationPractical Knowledge Representation
Practical Knowledge Representationbutest
 
Integrating vague association mining with markov model
Integrating vague association mining with markov modelIntegrating vague association mining with markov model
Integrating vague association mining with markov modelijsc
 
S1240068 presentation
S1240068 presentationS1240068 presentation
S1240068 presentationKuriharaYuta1
 
STUDY OF DISTANCE MEASUREMENT TECHNIQUES IN CONTEXT TO PREDICTION MODEL OF WE...
STUDY OF DISTANCE MEASUREMENT TECHNIQUES IN CONTEXT TO PREDICTION MODEL OF WE...STUDY OF DISTANCE MEASUREMENT TECHNIQUES IN CONTEXT TO PREDICTION MODEL OF WE...
STUDY OF DISTANCE MEASUREMENT TECHNIQUES IN CONTEXT TO PREDICTION MODEL OF WE...ijscai
 
Probability Density Functions of the Packet Length for Computer Networks With...
Probability Density Functions of the Packet Length for Computer Networks With...Probability Density Functions of the Packet Length for Computer Networks With...
Probability Density Functions of the Packet Length for Computer Networks With...IJCNCJournal
 
Selection of optimal hyper-parameter values of support vector machine for sen...
Selection of optimal hyper-parameter values of support vector machine for sen...Selection of optimal hyper-parameter values of support vector machine for sen...
Selection of optimal hyper-parameter values of support vector machine for sen...journalBEEI
 
Ternary Tree Based Approach For Accessing the Resources by Overlapping Member...
Ternary Tree Based Approach For Accessing the Resources by Overlapping Member...Ternary Tree Based Approach For Accessing the Resources by Overlapping Member...
Ternary Tree Based Approach For Accessing the Resources by Overlapping Member...IJECEIAES
 
Analysis of mass based and density based clustering techniques on numerical d...
Analysis of mass based and density based clustering techniques on numerical d...Analysis of mass based and density based clustering techniques on numerical d...
Analysis of mass based and density based clustering techniques on numerical d...Alexander Decker
 
Traffic Classification using a Statistical Approach
Traffic Classification using a Statistical ApproachTraffic Classification using a Statistical Approach
Traffic Classification using a Statistical ApproachDenis Zuev
 
ICDE2014 Session 14 Data Warehousing
ICDE2014 Session 14 Data WarehousingICDE2014 Session 14 Data Warehousing
ICDE2014 Session 14 Data WarehousingTakuma Wakamori
 
network mining and representation learning
network mining and representation learningnetwork mining and representation learning
network mining and representation learningsun peiyuan
 
Clustering Using Shared Reference Points Algorithm Based On a Sound Data Model
Clustering Using Shared Reference Points Algorithm Based On a Sound Data ModelClustering Using Shared Reference Points Algorithm Based On a Sound Data Model
Clustering Using Shared Reference Points Algorithm Based On a Sound Data ModelWaqas Tariq
 
A SURVEY TO REAL-TIME MESSAGE-ROUTING NETWORK SYSTEM WITH KLA MODELLING
A SURVEY TO REAL-TIME MESSAGE-ROUTING NETWORK SYSTEM WITH KLA MODELLINGA SURVEY TO REAL-TIME MESSAGE-ROUTING NETWORK SYSTEM WITH KLA MODELLING
A SURVEY TO REAL-TIME MESSAGE-ROUTING NETWORK SYSTEM WITH KLA MODELLINGijseajournal
 

What's hot (17)

IDENTIFICATION AND INVESTIGATION OF THE USER SESSION FOR LAN CONNECTIVITY VIA...
IDENTIFICATION AND INVESTIGATION OF THE USER SESSION FOR LAN CONNECTIVITY VIA...IDENTIFICATION AND INVESTIGATION OF THE USER SESSION FOR LAN CONNECTIVITY VIA...
IDENTIFICATION AND INVESTIGATION OF THE USER SESSION FOR LAN CONNECTIVITY VIA...
 
User Priority Based Search on Organizing User Search Histories with Security
User Priority Based Search on Organizing User Search Histories with SecurityUser Priority Based Search on Organizing User Search Histories with Security
User Priority Based Search on Organizing User Search Histories with Security
 
Practical Knowledge Representation
Practical Knowledge RepresentationPractical Knowledge Representation
Practical Knowledge Representation
 
Integrating vague association mining with markov model
Integrating vague association mining with markov modelIntegrating vague association mining with markov model
Integrating vague association mining with markov model
 
S1240068 presentation
S1240068 presentationS1240068 presentation
S1240068 presentation
 
STUDY OF DISTANCE MEASUREMENT TECHNIQUES IN CONTEXT TO PREDICTION MODEL OF WE...
STUDY OF DISTANCE MEASUREMENT TECHNIQUES IN CONTEXT TO PREDICTION MODEL OF WE...STUDY OF DISTANCE MEASUREMENT TECHNIQUES IN CONTEXT TO PREDICTION MODEL OF WE...
STUDY OF DISTANCE MEASUREMENT TECHNIQUES IN CONTEXT TO PREDICTION MODEL OF WE...
 
Probability Density Functions of the Packet Length for Computer Networks With...
Probability Density Functions of the Packet Length for Computer Networks With...Probability Density Functions of the Packet Length for Computer Networks With...
Probability Density Functions of the Packet Length for Computer Networks With...
 
Selection of optimal hyper-parameter values of support vector machine for sen...
Selection of optimal hyper-parameter values of support vector machine for sen...Selection of optimal hyper-parameter values of support vector machine for sen...
Selection of optimal hyper-parameter values of support vector machine for sen...
 
Notes On Networking 1
Notes On Networking 1Notes On Networking 1
Notes On Networking 1
 
Ternary Tree Based Approach For Accessing the Resources by Overlapping Member...
Ternary Tree Based Approach For Accessing the Resources by Overlapping Member...Ternary Tree Based Approach For Accessing the Resources by Overlapping Member...
Ternary Tree Based Approach For Accessing the Resources by Overlapping Member...
 
Analysis of mass based and density based clustering techniques on numerical d...
Analysis of mass based and density based clustering techniques on numerical d...Analysis of mass based and density based clustering techniques on numerical d...
Analysis of mass based and density based clustering techniques on numerical d...
 
Traffic Classification using a Statistical Approach
Traffic Classification using a Statistical ApproachTraffic Classification using a Statistical Approach
Traffic Classification using a Statistical Approach
 
ICDE2014 Session 14 Data Warehousing
ICDE2014 Session 14 Data WarehousingICDE2014 Session 14 Data Warehousing
ICDE2014 Session 14 Data Warehousing
 
network mining and representation learning
network mining and representation learningnetwork mining and representation learning
network mining and representation learning
 
Clustering Using Shared Reference Points Algorithm Based On a Sound Data Model
Clustering Using Shared Reference Points Algorithm Based On a Sound Data ModelClustering Using Shared Reference Points Algorithm Based On a Sound Data Model
Clustering Using Shared Reference Points Algorithm Based On a Sound Data Model
 
A SURVEY TO REAL-TIME MESSAGE-ROUTING NETWORK SYSTEM WITH KLA MODELLING
A SURVEY TO REAL-TIME MESSAGE-ROUTING NETWORK SYSTEM WITH KLA MODELLINGA SURVEY TO REAL-TIME MESSAGE-ROUTING NETWORK SYSTEM WITH KLA MODELLING
A SURVEY TO REAL-TIME MESSAGE-ROUTING NETWORK SYSTEM WITH KLA MODELLING
 
F0313539
F0313539F0313539
F0313539
 

Similar to Marketing analysis

Twitter as a personalizable information service ii
Twitter as a personalizable information service iiTwitter as a personalizable information service ii
Twitter as a personalizable information service iiKan-Han (John) Lu
 
A Novel Approach for User Search Results Using Feedback Sessions
A Novel Approach for User Search Results Using Feedback  SessionsA Novel Approach for User Search Results Using Feedback  Sessions
A Novel Approach for User Search Results Using Feedback SessionsIJMER
 
Social Friend Overlying Communities Based on Social Network Context
Social Friend Overlying Communities Based on Social Network ContextSocial Friend Overlying Communities Based on Social Network Context
Social Friend Overlying Communities Based on Social Network ContextIRJET Journal
 
IRJET - Twitter Spam Detection using Cobweb
IRJET - Twitter Spam Detection using CobwebIRJET - Twitter Spam Detection using Cobweb
IRJET - Twitter Spam Detection using CobwebIRJET Journal
 
Volume 2-issue-6-1955-1959
Volume 2-issue-6-1955-1959Volume 2-issue-6-1955-1959
Volume 2-issue-6-1955-1959Editor IJARCET
 
Volume 2-issue-6-1955-1959
Volume 2-issue-6-1955-1959Volume 2-issue-6-1955-1959
Volume 2-issue-6-1955-1959Editor IJARCET
 
COMMUNITY DETECTION IN THE COLLABORATIVE WEB
COMMUNITY DETECTION IN THE COLLABORATIVE WEBCOMMUNITY DETECTION IN THE COLLABORATIVE WEB
COMMUNITY DETECTION IN THE COLLABORATIVE WEBIJMIT JOURNAL
 
Data Visualization and Communication by Big Data
Data Visualization and Communication by Big DataData Visualization and Communication by Big Data
Data Visualization and Communication by Big DataIRJET Journal
 
Effectual citizen relationship management with data mining techniques
Effectual citizen relationship management with data mining techniquesEffectual citizen relationship management with data mining techniques
Effectual citizen relationship management with data mining techniqueseSAT Publishing House
 
Effectual citizen relationship management with data mining techniques
Effectual citizen relationship management with data mining techniquesEffectual citizen relationship management with data mining techniques
Effectual citizen relationship management with data mining techniqueseSAT Journals
 
Hybrid-e-greedy for mobile context-aware recommender system
Hybrid-e-greedy for mobile context-aware recommender systemHybrid-e-greedy for mobile context-aware recommender system
Hybrid-e-greedy for mobile context-aware recommender systemBouneffouf Djallel
 
Graph Based User Interest Modeling in Twitter
Graph Based User Interest Modeling in TwitterGraph Based User Interest Modeling in Twitter
Graph Based User Interest Modeling in Twitterraghavr186
 
Paper id 212014111
Paper id 212014111Paper id 212014111
Paper id 212014111IJRAT
 
ADAPTIVE MODEL FOR WEB SERVICE RECOMMENDATION
ADAPTIVE MODEL FOR WEB SERVICE RECOMMENDATIONADAPTIVE MODEL FOR WEB SERVICE RECOMMENDATION
ADAPTIVE MODEL FOR WEB SERVICE RECOMMENDATIONijwscjournal
 
Social media recommendation based on people and tags (final)
Social media recommendation based on people and tags (final)Social media recommendation based on people and tags (final)
Social media recommendation based on people and tags (final)es712
 
cloudworkloadanalysisandsimulation-140521153543-phpapp02
cloudworkloadanalysisandsimulation-140521153543-phpapp02cloudworkloadanalysisandsimulation-140521153543-phpapp02
cloudworkloadanalysisandsimulation-140521153543-phpapp02PRIYANKA MEHTA
 
Evaluating and Enhancing Efficiency of Recommendation System using Big Data A...
Evaluating and Enhancing Efficiency of Recommendation System using Big Data A...Evaluating and Enhancing Efficiency of Recommendation System using Big Data A...
Evaluating and Enhancing Efficiency of Recommendation System using Big Data A...IRJET Journal
 
IEEE 2014 JAVA DATA MINING PROJECTS Discovering emerging topics in social str...
IEEE 2014 JAVA DATA MINING PROJECTS Discovering emerging topics in social str...IEEE 2014 JAVA DATA MINING PROJECTS Discovering emerging topics in social str...
IEEE 2014 JAVA DATA MINING PROJECTS Discovering emerging topics in social str...IEEEFINALYEARSTUDENTPROJECTS
 
2014 IEEE JAVA DATA MINING PROJECT Discovering emerging topics in social stre...
2014 IEEE JAVA DATA MINING PROJECT Discovering emerging topics in social stre...2014 IEEE JAVA DATA MINING PROJECT Discovering emerging topics in social stre...
2014 IEEE JAVA DATA MINING PROJECT Discovering emerging topics in social stre...IEEEFINALYEARSTUDENTPROJECT
 

Similar to Marketing analysis (20)

Twitter as a personalizable information service ii
Twitter as a personalizable information service iiTwitter as a personalizable information service ii
Twitter as a personalizable information service ii
 
A Novel Approach for User Search Results Using Feedback Sessions
A Novel Approach for User Search Results Using Feedback  SessionsA Novel Approach for User Search Results Using Feedback  Sessions
A Novel Approach for User Search Results Using Feedback Sessions
 
Social Friend Overlying Communities Based on Social Network Context
Social Friend Overlying Communities Based on Social Network ContextSocial Friend Overlying Communities Based on Social Network Context
Social Friend Overlying Communities Based on Social Network Context
 
IRJET - Twitter Spam Detection using Cobweb
IRJET - Twitter Spam Detection using CobwebIRJET - Twitter Spam Detection using Cobweb
IRJET - Twitter Spam Detection using Cobweb
 
Volume 2-issue-6-1955-1959
Volume 2-issue-6-1955-1959Volume 2-issue-6-1955-1959
Volume 2-issue-6-1955-1959
 
Volume 2-issue-6-1955-1959
Volume 2-issue-6-1955-1959Volume 2-issue-6-1955-1959
Volume 2-issue-6-1955-1959
 
COMMUNITY DETECTION IN THE COLLABORATIVE WEB
COMMUNITY DETECTION IN THE COLLABORATIVE WEBCOMMUNITY DETECTION IN THE COLLABORATIVE WEB
COMMUNITY DETECTION IN THE COLLABORATIVE WEB
 
Data Visualization and Communication by Big Data
Data Visualization and Communication by Big DataData Visualization and Communication by Big Data
Data Visualization and Communication by Big Data
 
Effectual citizen relationship management with data mining techniques
Effectual citizen relationship management with data mining techniquesEffectual citizen relationship management with data mining techniques
Effectual citizen relationship management with data mining techniques
 
Effectual citizen relationship management with data mining techniques
Effectual citizen relationship management with data mining techniquesEffectual citizen relationship management with data mining techniques
Effectual citizen relationship management with data mining techniques
 
Hybrid-e-greedy for mobile context-aware recommender system
Hybrid-e-greedy for mobile context-aware recommender systemHybrid-e-greedy for mobile context-aware recommender system
Hybrid-e-greedy for mobile context-aware recommender system
 
Graph Based User Interest Modeling in Twitter
Graph Based User Interest Modeling in TwitterGraph Based User Interest Modeling in Twitter
Graph Based User Interest Modeling in Twitter
 
Paper id 212014111
Paper id 212014111Paper id 212014111
Paper id 212014111
 
ADAPTIVE MODEL FOR WEB SERVICE RECOMMENDATION
ADAPTIVE MODEL FOR WEB SERVICE RECOMMENDATIONADAPTIVE MODEL FOR WEB SERVICE RECOMMENDATION
ADAPTIVE MODEL FOR WEB SERVICE RECOMMENDATION
 
Social media recommendation based on people and tags (final)
Social media recommendation based on people and tags (final)Social media recommendation based on people and tags (final)
Social media recommendation based on people and tags (final)
 
cloudworkloadanalysisandsimulation-140521153543-phpapp02
cloudworkloadanalysisandsimulation-140521153543-phpapp02cloudworkloadanalysisandsimulation-140521153543-phpapp02
cloudworkloadanalysisandsimulation-140521153543-phpapp02
 
Evaluating and Enhancing Efficiency of Recommendation System using Big Data A...
Evaluating and Enhancing Efficiency of Recommendation System using Big Data A...Evaluating and Enhancing Efficiency of Recommendation System using Big Data A...
Evaluating and Enhancing Efficiency of Recommendation System using Big Data A...
 
rscript_paper-1
rscript_paper-1rscript_paper-1
rscript_paper-1
 
IEEE 2014 JAVA DATA MINING PROJECTS Discovering emerging topics in social str...
IEEE 2014 JAVA DATA MINING PROJECTS Discovering emerging topics in social str...IEEE 2014 JAVA DATA MINING PROJECTS Discovering emerging topics in social str...
IEEE 2014 JAVA DATA MINING PROJECTS Discovering emerging topics in social str...
 
2014 IEEE JAVA DATA MINING PROJECT Discovering emerging topics in social stre...
2014 IEEE JAVA DATA MINING PROJECT Discovering emerging topics in social stre...2014 IEEE JAVA DATA MINING PROJECT Discovering emerging topics in social stre...
2014 IEEE JAVA DATA MINING PROJECT Discovering emerging topics in social stre...
 

More from Gaurav Dubey

Microsoft exam (2)
Microsoft exam (2)Microsoft exam (2)
Microsoft exam (2)Gaurav Dubey
 
Health care analytics
Health care analyticsHealth care analytics
Health care analyticsGaurav Dubey
 
Banking case study
Banking case studyBanking case study
Banking case studyGaurav Dubey
 
Presentation of knapsack
Presentation of knapsackPresentation of knapsack
Presentation of knapsackGaurav Dubey
 
Value thinking. gaurav
Value thinking. gauravValue thinking. gaurav
Value thinking. gauravGaurav Dubey
 

More from Gaurav Dubey (6)

Microsoft exam (2)
Microsoft exam (2)Microsoft exam (2)
Microsoft exam (2)
 
Health care analytics
Health care analyticsHealth care analytics
Health care analytics
 
Banking case study
Banking case studyBanking case study
Banking case study
 
Healthcare
HealthcareHealthcare
Healthcare
 
Presentation of knapsack
Presentation of knapsackPresentation of knapsack
Presentation of knapsack
 
Value thinking. gaurav
Value thinking. gauravValue thinking. gaurav
Value thinking. gaurav
 

Recently uploaded

PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Jack DiGiovanna
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts ServiceSapana Sha
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home ServiceSapana Sha
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubaihf8803863
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingNeil Barnes
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一fhwihughh
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAbdelrhman abooda
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改atducpo
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfJohn Sterrett
 
Data Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxData Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxFurkanTasci3
 

Recently uploaded (20)

PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts Service
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data Storytelling
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
 
Data Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxData Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptx
 

Marketing analysis

  • 1. Marketing Analytics Case Study Modelling in Operation Management
  • 2. A method for discovering clusters of e-commerce interest patterns using clickstream data Qiang Su, Lu Chen A method for discovering clusters of e-commerce interest patterns using clickstream data Qiang Su, Lu Chen ⇑ School of Economics & Management, Tongji University, Shanghai 200092, China
  • 3. Objective Derive interest patterns that can provide significant assistances on web page optimization and personalized recommendation.
  • 4. In this paper, with the aim of finding users’ interests based on their click-stream data, a novel approach to discovering user interests is developed and studied systematically.
  • 5. What is Clickstream ?? The record of an internet user’s actions online has come to be known as click- stream data. Click-stream data captures a wide variety of information in a complete, timely and accurate manner. This data covers user activities such as browsing paths, purchased products and clicked banner ads.
  • 6. 1. Introduction To attract more customers, e-commerce companies are continuously diversifying their products and increasing their category range. Large e-commerce organizations frequently see more than a million customers per day log on to their websites. Those potential customers view hundreds of thousands of catalog items everyday. As a result, a specific challenge arises for these e-commerce companies; namely how to discover the website users’ interests and promote sales by effectively managing an ever-increasing number of categories and products.
  • 7. 2. Category structure and user interest measurement
  • 8. 2.1. Category structure and assumptions relating to a user’s interest In order to attract more customers, e- business companies have been expanding to provide more and more products on their websites. Normally, an e-commerce website, such as Amazon.com, presents its product information using a hierarchical tree structure, as shown in fig.
  • 9. 2.1. Category structure and assumptions relating to a user’s interest Every user has a preference when he visits a website. Users only visit products that interest them. Users with similar interests show similar browsing behaviors. The frequency of visits to the product pages is positive correlate to users’ interests. The duration for which users stay on a product webpage is positive correlate to their interests. The sequence of visits to a product webpage is positive correlate to users’ interests.
  • 10. 2.2. Indicators of a user’s interest 1. Indicator 1 : category visiting path Session: A session is defined as a period of sustained web browsing or a sequence of page viewings. Browsing path: The browsing path Pi {url1 , url2 , url3 , . . . , urln} of useri is a sequence of web pages browsed during a session. The length of Pi is expressed as len(Pi) = n. Eg. the browsing path P1 {ctg1 ; item1 1 ; ctg1 ; item2 1 ; item1 2 } Category visiting path: The category visiting path CtgPi {ctg1 , ctg2 , ctg3 , . . . , ctgm } is a sequence of categories visited by user i during any given session. The length of CtgPi is m, where m<=n. Eg: the category visiting path of user1 is CtgPi {ctg1 , ctg2 } according to his browsing path P1 {ctg1 ; item1 1 ; ctg1 ; item2 1 ; item1 2 }.
  • 11. 2.2. Indicators of a user’s interest Indicator 2: visiting frequency Hits: For a website, hitsi j is utilized to count the number of visits (or hits) to category ctgj by useri in a session. It consist of two parts: the number of visits to ctgj and the length of the visits to product items which belong to ctgj . Hitsi j = count(useri, ctgj ) Σk=1 l count(useri, itemk j) …(1) Visiting frequency: Visiting frequency is defined as the ratio of category hits to the length of the user’s browsing path. ….(2) Category-user associated matrix with frequency characteristics: A category-user associated matrix with frequency characteristic Rt*s freq is used to describe the relationship between categories and users in terms of visiting frequency.
  • 12. 2.2. Indicators of a user’s interest Duration: we use durationi j to denote the total time that useri spends browsing category ctgj in a session. …(3) Relative duration: Relative duration redui j is defined as the ratio of time useri spends on category ctgj to the time of the entire session. ….(4) Category-User associated matrix with time characteristics: A category-user associated matrix with time characteristic Rt*s time is used to describe the relationships between categories and users in terms of
  • 13. 3. Design of the rough leader clustering algorithm A rough leader clustering algorithm is developed using a leader clustering algorithm leveraged by a rough set theory. The leader clustering calculation is based on users’ similarities in terms of browsing behavior. At the same time, rough set analysis makes it possible for a user to be assigned to more than one cluster.
  • 14. 3.1 Measurements for similarity Given the indicators of browsing behaviors, i.e., category visiting path CtgPi , visiting frequency freqi j and relative duration redui j , the similarity of behaviorscan be evaluated quantitatively. Thereby, users can accordingly be assigned to different clusters. Various functions have been developed to estimate the degree of similarity between two users, such as the Pearson correlation coefficient, cosine similarity or distance measures. The choice of similarity function should be made properly, based on the data set at hand. Frequency similarity: The frequency similarity between two users user p and user q is defined as a cosine similarity measure of two vectors p , q , which are row vectors extracted from matrix Rt*s freq . ….(5)
  • 15. 3.1 Measurements for similarity Duration similarity: Similarly, the duration similarity between two users user p and user q is defined as follows: ….(6) Path similarity: The path similarity simpq(path) between two users user p and user q is defined as the common path length divided by the maximal path length. …..(7)
  • 16. Given the above three similarities, the total similarity between two users user p and user q is defined as follows: simpq(path) = 𝜶*simpq(seq) + 𝜷*simpq(freq) + 𝛄*simpq(time) …..(8) In which, 𝜶 , 𝜷, 𝛄 are used to adjust the weight of the three dimensions of sequence, frequency, and duration (time). Also, 𝜶+𝜷+𝛄 = 1 , 0<=simpq<=1.
  • 17. 3.2 Cluster algorithm The algorithm starts with a randomly selected object as the initial leader. For each object in the data set, we calculate the similarity between the object and the leader. If the similarity meets the predefined threshold, then the object is assigned to the cluster represented by the leader; otherwise, the object will be regarded as a new leader. These procedures will be repeated until all objects are either assigned to a cluster or regarded as leaders.
  • 18. Leader Clustering Algorithm: Leader algorithm is a incremental clustering algorithm generally used to cluster large data sets. This algorithm is order dependent and may form different clusters based on the order the data set is provided to the algorithm. The algorithm consists of the following steps.
  • 19. Step 1: Assign the first data item, P1 to cluster C1. This data set will be the leader of the cluster C1. Step 2:Now move to the next data item say P2 and calculate its distance from the leader P1. If the distance between P2 and leader P1 is less than a user specified threshold (t) then data point P2 is assigned to this cluster (Cluster C1). If the distance between leader P1 and data item P2 is more than the user specified threshold t, then form a new cluster C2 and assign P2 to this new cluster. P2 will be the leader of the cluster C2.
  • 20. Step3: For all the remaining data items the distance between the data point and the leader of the clusters is calculated. If the distance between the data items and the any of the leader is less then the user specified threshold, the data point is assigned to that cluster. However, If the distance between the data point and the any of the cluster's leader is more than the user specified threshold, a new cluster is created and that particular data point is assigned to that cluster and considered the leader of the cluster. Step 4: Repeat Step 3 till all the data items are assigned to clusters
  • 21. Let’s do it with an example : A (1, 1),B(1, 2), C(2, 2), D(6, 2), E(7, 2), F(6, 6), G(7, 6)
  • 22. Rough set property Property 1: An object can be a member of at most one lower bound. This is consistent with the definition of lower approximation, where members belong to only one cluster. Property 2: The lower bound is contained in the upper bound. An object that is a member of the lower approximation of a cluster is also a member of the upper approximation of the same cluster. Property 3: An object cannot belong to only a single upper bound region. An object that does not belong to any lower approximation must be a member of at least two upper approximations.
  • 23. General threshold Ꞇ(tau) : General threshold s is utilized to control whether or not a user should be assigned to an existing cluster. Rough threshold ເ: The overlap among clusters is controlled by rough threshold f.
  • 24. Preprocess and Filtering outliers of data: In order to improve data quality ● We eliminate a number of unvalued categories and users ● We discard categories and irrelevant records which are very rarely visited by users. ● We also omit certain invalid customers who visit very few categories. ● The time spent on a webpage is estimated as the time interval between two adjacent page requests. In this case, an extraordinarily long period of time will be considered an outlier
  • 25. To simplify the calculation, the dataset is evenly divided by a fixed interval, which ensures the data still covers the full day and thereby ensuring the results will not be affected by time. As a result, the experimental dataset contains 10,000 users and 1810 categories. Although the user number in the experimental dataset is only about one-third of that in the whole dataset, the number of categories is decreased only very slightly, from 1823 to 1810.
  • 26.
  • 28. Discussion Comparison with other algorithms As is already well accepted, a good clustering algorithm should be able to search out clusters with the shortest distances between members of a single cluster, and with the longest distances between members of different clusters. From this perspective, Dunn’s index and Davies–Bouldin’s index are usually employed to compare clustering results. A smaller D(U) value and a larger DB(U) value mean that derived clusters are compact, and their centers are far away from each other.
  • 31. Conclusion The experiment’s findings demonstrate that the newly-proposed approach can outperform the traditional leader clustering algorithm in terms of effectiveness and efficiency. Using this approach, three main interest patterns are discovered, and the stability of the three patterns is tested. These interest patterns can provide significant assistance to online retailers who wish to personalize their website’s layout and navigation structure. More importantly, these interest patterns can improve the site’s recommendation strategies and make the site more effective.

Editor's Notes

  1. Thanks to the development of information technology, the internet allows for the real-time, low cost and unobtrusive collection of detailed information regarding individuals’ activities. The record of an internet user’s actions online has come to be known as click-stream data. Click-stream data captures a wide variety of information in a complete, timely and accurate manner. This data covers user activities such as browsing paths, purchased products and clicked banner ads. Click-stream data is becoming one of the most useful resources for researchers and practitioners attempting to understand individuals’ behaviors in terms of choice