Profiler for Smartphone Users Interests Using
Modified Hierarchical Agglomerative Clustering
Algorithm Based on Browsing History
Priagung Khusumanegara, Rischan Mafrur, and Deokjai Choi
School of Electronics & Computer Engineering
Chonnam National University
{priagung.123, rischanlab}@gmail.com, dchoi@jnu.ac.kr
The 23rd IFIP World Computer Congress (WCC 2015)
Tuesday, October 6
Outline
• Introduction
• Proposed System
• Data Collection
• Data Extraction
• URL Filtering Categories
• Modified Hierarchical Agglomerative Clustering
• Experimental Results
Introduction
Motivations:
• Smartphones provide many applications that support our daily activities,
one of which is the web browser.
• People spend much of their time browsing to find useful information
that interests them.
• It is not easy to find the particular pieces of information that a user is
interested in.
Proposed System:
• We propose a Modified Hierarchical Agglomerative Clustering
algorithm, running in a server-based application, that builds
interest-focused profiles of smartphone users from their browsing history.
Proposed System
Figure: Illustration of Proposed System
Data Collection
• Data Collection Application
– It is built using the Funf framework, an extensible sensing
and data processing framework for mobile devices.
• Number of participants
– 30 Chonnam National University students (aged 19-22)
• Period of data collection
– 1 month (July 2014)
Figure: Data Collection Application
Data Extraction
• Example of Collected Data
• We process the collected data to extract the resource name
part of the URL structure, which is useful information for
analyzing user interest.
• Example of data extraction
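The extraction step can be illustrated with a minimal Python sketch. It assumes that the "resource name" is the host part of the URL, which the slide does not state explicitly; the function name `resource_name` and the sample URLs are hypothetical.

```python
from urllib.parse import urlsplit


def resource_name(url: str) -> str:
    """Return the lower-cased host of a URL, without a leading 'www.'.

    Assumption: the "resource name" on the slide is the host part.
    Returns an empty string when the input has no recognizable host.
    """
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


# Hypothetical browsing-history entries, for illustration only.
history = [
    "http://www.example.com/news/today?id=7",
    "https://mail.example.org/inbox",
]
names = [resource_name(u) for u in history]
print(names)
```

Reducing each URL to its host keeps the later labeling step independent of page paths and query strings.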
URL Filtering Categories
• URL Labeling
The URL data is labeled based on a popular Web directory, the
Open Directory Project (ODP) (dmoz.org).
• URL Filtering Categories
Category Group             Category Type
Business                   Business/Economy, Job Search/Careers, Real Estate, and Shopping
Communications and search  Blog/Web Communication, Social Networks, Email, and Search Engines
General                    Computer/Internet, Education, News/Media, and Reference
Lifestyle                  Entertainment, Games, Arts, Humor, Religion, Restaurants/Food, and Travel
• The matrix of filtering categories
\[
C_{4 \times n} =
\begin{bmatrix}
key_{1,1} & key_{1,2} & \dots & key_{1,n} \\
key_{2,1} & key_{2,2} & \dots & key_{2,n} \\
key_{3,1} & key_{3,2} & \dots & key_{3,n} \\
key_{4,1} & key_{4,2} & \dots & key_{4,n}
\end{bmatrix}
\]
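The category groups and the 4 × n key matrix above can be sketched as a simple lookup table in Python. This is only an illustration: it assumes the category types listed on the slide serve as the keys of each row, and it uses rows of unequal length rather than a padded 4 × n matrix. The names `CATEGORY_MATRIX` and `group_of` are hypothetical.

```python
from typing import Dict, List, Optional

# One row per category group; the entries play the role of key_{i,1..n}.
# Assumption: the slide's category types are used as the keys.
CATEGORY_MATRIX: Dict[str, List[str]] = {
    "Business": [
        "Business/Economy", "Job Search/Careers", "Real Estate", "Shopping",
    ],
    "Communications and search": [
        "Blog/Web Communication", "Social Networks", "Email", "Search Engines",
    ],
    "General": [
        "Computer/Internet", "Education", "News/Media", "Reference",
    ],
    "Lifestyle": [
        "Entertainment", "Games", "Arts", "Humor", "Religion",
        "Restaurants/Food", "Travel",
    ],
}


def group_of(category_type: str) -> Optional[str]:
    """Return the category group whose row contains the given key, if any."""
    for group, keys in CATEGORY_MATRIX.items():
        if category_type in keys:
            return group
    return None


print(group_of("Social Networks"))
print(group_of("Travel"))
```

A label that matches no row yields `None`, so unlabeled URLs can be skipped or handled separately.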
Modified Hierarchical Agglomerative Clustering
… $c_{min_1}$, $c_{min_2}$ … $C_{group_1}$
16: elif $\{c_{min_1}, c_{min_2}\}$ any $[key_{2,n}]$, where $n = 1, 2, \dots, n$:
17:     Add $c_{min_1}, c_{min_2}$ to $C_{group_2}$
18: elif $\{c_{min_1}, c_{min_2}\}$ any $[key_{3,n}]$, where $n = 1, 2, \dots, n$:
19:     Add $c_{min_1}, c_{min_2}$ to $C_{group_3}$
20: elif $\{c_{min_1}, c_{min_2}\}$ any $[key_{4,n}]$, where $n = 1, 2, \dots, n$:
21:     Add $c_{min_1}, c_{min_2}$ to $C_{group_4}$
22: $l = l + 1$
23: end while
24: $\text{Degree of Interest}(C_{group_1}) = \dfrac{\mathrm{length}(C_{group_1})}{\sum_{i=1}^{4} \mathrm{length}(C_{group_i})}$
25: $\text{Degree of Interest}(C_{group_2}) = \dfrac{\mathrm{length}(C_{group_2})}{\sum_{i=1}^{4} \mathrm{length}(C_{group_i})}$
26: $\text{Degree of Interest}(C_{group_3}) = \dfrac{\mathrm{length}(C_{group_3})}{\sum_{i=1}^{4} \mathrm{length}(C_{group_i})}$
27: $\text{Degree of Interest}(C_{group_4}) = \dfrac{\mathrm{length}(C_{group_4})}{\sum_{i=1}^{4} \mathrm{length}(C_{group_i})}$
28: set $C_{group_i}$ which has max Degree of Interest as user interests
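Steps 24 to 28 above amount to a normalized count followed by an arg-max, which can be sketched in Python as follows. This covers only the final scoring step, not the clustering loop; the function name `degree_of_interest` and the sample group contents are hypothetical.

```python
from typing import Dict, List


def degree_of_interest(groups: Dict[str, List[str]]) -> Dict[str, float]:
    """Share of clustered items that fell into each group (steps 24-27)."""
    total = sum(len(members) for members in groups.values())
    if total == 0:
        # No items were assigned to any group; report zero interest everywhere.
        return {name: 0.0 for name in groups}
    return {name: len(members) / total for name, members in groups.items()}


# Hypothetical group contents for one user, for illustration only.
groups = {
    "Business": ["shop-a", "shop-b"],
    "Communications and search": ["sns-a", "sns-b", "mail-a", "search-a", "blog-a"],
    "General": ["news-a", "edu-a"],
    "Lifestyle": ["game-a"],
}
degrees = degree_of_interest(groups)
# Step 28: the group with the largest degree is taken as the user's interest.
top_group = max(degrees, key=degrees.get)
print(degrees, top_group)
```

Because every degree shares the same denominator, the four values sum to 1 whenever at least one item has been assigned.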
Experimental Results
Figure: Degree of Users Interests
Figure: Users Interests Profile
Experimental Results (Cont’d)
• C4.5 is a well-known classification
algorithm that generates decision trees
from continuous and discrete attributes.
• Attributes:
– Total session time
– Total time a user stays at the site
– Total number of accessed pages during the whole
session
Experimental Results (Cont’d)
Figure: Execution time comparison with the C4.5 algorithm
Experimental Results (Cont’d)
Memory Size (MB)   MHAC (%)   C4.5 (%)
25                 83.3       76.7
50                 86.7       80.0
75                 93.3       80.0
100                96.7       83.3
125                96.7       86.7
Table: Accuracy comparison with the C4.5 algorithm
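As a quick check of the accuracy table, the per-row gap and the mean accuracies can be computed with a few lines of Python. The numbers are copied from the table; the variable names are illustrative only.

```python
# (memory size in MB, MHAC accuracy in %, C4.5 accuracy in %), from the table.
rows = [
    (25, 83.3, 76.7),
    (50, 86.7, 80.0),
    (75, 93.3, 80.0),
    (100, 96.7, 83.3),
    (125, 96.7, 86.7),
]

# Accuracy gap in percentage points at each memory size.
gaps = [round(mhac - c45, 1) for _, mhac, c45 in rows]

# Mean accuracy of each algorithm over the five memory sizes.
mean_mhac = sum(mhac for _, mhac, _ in rows) / len(rows)
mean_c45 = sum(c45 for _, _, c45 in rows) / len(rows)

print(gaps)
print(round(mean_mhac, 2), round(mean_c45, 2), round(mean_mhac - mean_c45, 2))
```

The gap is positive in every row, which is consistent with the claim that MHAC is more accurate than C4.5 at each memory size tested.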
Conclusion and Future Work
• We proposed a modified Hierarchical Agglomerative Clustering algorithm that
automatically produces a user interest profile.
• The proposed algorithm measures the degree of a user's interests and
infers the particular pieces of information the user is interested in, based on
browsing history.
• Our approach outperforms the C4.5 algorithm in both execution time and
accuracy at every memory size tested.
• In future work, we plan to implement the modified Hierarchical
Agglomerative Clustering with MapReduce to improve the performance of the
clustering algorithm.
Acknowledgements
This research was supported by the Basic Science
Research Program through the National Research
Foundation of Korea (NRF), funded by the Ministry
of Education (2012R1A1A2007014).
Thank You
감사합니다
