Analysis of Transaction Logs from National Museums Liverpool
David Walsh, Mark M Hall, Paul Clough, Frank Hopfgartner and Jonathan Foster
Edge Hill University & Martin-Luther-University Halle-Wittenberg & Sheffield University & Peak Indicators
TPDL 2019, Oslo
Context
What to Search for ????
User group                      Share
General Public                  49.7%
Non-Professionals / Hobbyists   26.9%
Students                        6.5%
Others                          5.1%
Teachers                        4.9%
Academics                       4.9%
Museum Staff                    2%
Walsh, D., Hall, M., Clough, P., Foster, J.: The ghost in the museum website: investigating the general public’s interactions with museum websites. In:
International Conference on Theory and Practice of Digital Libraries, Springer (2017)
This study
Aim - Investigate how representative of the full website audience the survey
respondents are.
RO1 - Conduct transaction log analysis (TLA).
RO2 - Cluster the web log data.
RO3 - Identify whether the clusters represent any of the known user groups.
NML website
Experiment Overview
● Server logs extracted for Jan-Mar 2017
● User-based (multi-session) clustering of log data
● Transaction log analysis conducted
● Georeferenced the logs
● Log files cleaned
TLA Findings
● 586,868 page requests.
○ 321,174 unique users (multi-session groups)
Day        Mon     Tue     Wed     Thur    Fri     Sat     Sun     Total
Requests   81k     100k    101k    97k     85k     55k     66k     586,868
%          13.88   17.09   17.26   16.58   14.59   9.37    11.23   100
TLA Findings
Museum        Requests
ISM           97,686
Other Pages   92,433
WML           86,516
Walker        73,194
Maritime      68,912
Events        58,273
MOL           54,697
Ladylever     24,607
Shop          21,740
Sudley        8,810
Total         586,868
TLA Findings
Requests by page type.
Country     Requests   Queries
UK          307,347    181,903
US          120,584    43,062
Denmark     32,012     9,098
Germany     16,878     7,846
Australia   15,805     4,012

UK City/Town   Requests   Queries
Manchester     40,992     20,696
Liverpool      37,804     23,014
London         32,012     9,098
Runcorn        16,878     7,846
Sheffield      15,805     4,012
...            ...        ...
Total          307,347    181,903

213 countries
Sessionisation method
He, D., Göker, A.: Detecting session boundaries from web user logs. In Proceedings of the BCS-IRSG 22nd annual
colloquium on information retrieval research (2000) 57-66
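The slide only cites the sessionisation method and does not spell it out. As a rough illustration, the sketch below assumes a time-gap approach of the kind the cited paper examines: a new session starts when the gap since the same user's previous request exceeds a threshold. The 30-minute threshold, the `sessionise` function and the sample log entries are all hypothetical and are not taken from the study.

```python
from datetime import datetime, timedelta

# Hypothetical inactivity threshold; the slides do not state the value used.
SESSION_GAP = timedelta(minutes=30)


def sessionise(requests, gap=SESSION_GAP):
    """Group (user_id, timestamp) pairs into per-user sessions.

    A new session starts whenever the time since the same user's
    previous request exceeds `gap`.
    """
    sessions = {}  # user_id -> list of sessions, each a list of timestamps
    for user_id, ts in sorted(requests):
        user_sessions = sessions.setdefault(user_id, [])
        if user_sessions and ts - user_sessions[-1][-1] <= gap:
            user_sessions[-1].append(ts)   # continue the current session
        else:
            user_sessions.append([ts])     # open a new session


    return sessions


# Made-up log entries for illustration only.
log = [
    ("user-a", datetime(2017, 1, 9, 10, 0, 0)),
    ("user-a", datetime(2017, 1, 9, 10, 5, 0)),
    ("user-a", datetime(2017, 1, 9, 14, 0, 0)),
    ("user-b", datetime(2017, 1, 9, 10, 1, 0)),
]
result = sessionise(log)
print({user: len(s) for user, s in result.items()})
```

Grouping by user first and splitting by time second keeps the multi-session view of each user that the later clustering step relies on.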
Session results overview
[Figure: session statistics. Labels recoverable from the slide: 10 sec, 76.8%, < 1, 97%]
Session results overview
Page Type             Entry     Exit
General               110,322   114,884
Item                  62,576    65,922
Museum overview       37,698    28,432
Collection overview   26,322    26,418
Event                 25,856    26,647
Kids                  14,125    14,087
Shop                  7,950     8,983

[Chart: requests by page type; axis ticks 0% to 40%. Values shown: 182,185; 139,163; 58,273; 56,675; 40,546; 36,694; 21,740; 11,019]
Session results overview
Only 2.2% used search
Only 7,121 searches
from 586,868 requests over
321,174 sessions
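The 2.2% figure appears to correspond to searches divided by sessions. This is an inference from the numbers on the slide, not something the slide states, and it would equal the share of sessions using search only if no session searched more than once. A quick check, using only the three counts given above:

```python
# Counts taken from the slide.
searches = 7_121
requests = 586_868
sessions = 321_174

# Searches as a percentage of sessions and of all page requests.
searches_per_session = 100 * searches / sessions
searches_per_request = 100 * searches / requests

print(f"{searches_per_session:.1f}% of sessions")
print(f"{searches_per_request:.1f}% of requests")
```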
URLs Classified Semi-Automatically
URL segment                                    Level
https://www.liverpoolmuseums.org.uk/           General
mol/                                           Museum/Gallery
collections/archaeology/cheshire/knutsford/    Collection
item-611992.aspx                               Item
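The automatic part of such a classification could be sketched as a few path rules applied to each requested URL. The rules, the `classify_url` function and the `MUSEUM_SLUGS` set are assumptions made for illustration: only the `mol` slug and the `item-` file name pattern appear in the example URL on this slide, and the study's actual rules are not given.

```python
from urllib.parse import urlparse

# Hypothetical museum slugs; only "mol" appears in the slide's example URL.
MUSEUM_SLUGS = {"mol", "ism", "wml", "walker", "maritime", "ladylever", "sudley"}


def classify_url(url):
    """Return a page type label for a URL using simple path rules."""
    segments = [s for s in urlparse(url).path.lower().split("/") if s]
    if not segments:
        return "General"                  # site root
    if segments[-1].startswith("item-"):
        return "Item"                     # e.g. item-611992.aspx
    if "collections" in segments:
        return "Collection"
    if segments[0] in MUSEUM_SLUGS:
        return "Museum/Gallery"
    return "General"


base = "https://www.liverpoolmuseums.org.uk/"
for path in ["", "mol/", "mol/collections/archaeology/cheshire/knutsford/",
             "mol/collections/archaeology/cheshire/knutsford/item-611992.aspx"]:
    print(classify_url(base + path), "<-", "/" + path)
```

The most specific rule is checked first, so an item page under a collection is labelled Item rather than Collection.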
Clustering Methodology
1. Cluster users not sessions.
(26 columns of data including: IP; User Agent; Location details; Total counts
for requests: session, page types visited, and query counts.)
2. Run elbow curve
3. Scale data
4. Cluster (by page type and query counts per user)
Attempted clustering methods:
● K-means
● K-modes (k-prototypes)
● DBSCAN
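The scale, elbow and cluster steps can be sketched with a small standard-library k-means. This is an illustration under stated assumptions, not the study's implementation: the per-user feature rows are invented, only three count features are used instead of the 26 columns mentioned above, and a deterministic farthest-first initialisation replaces the usual random start so the result is reproducible. The sketch also scales the data before computing the elbow values so that both steps see the same feature space.

```python
from statistics import mean, pstdev


def standardise(rows):
    """Scale each column to zero mean and unit variance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # avoid dividing by zero
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(rows, centres):
    """Index of the nearest centre for every row."""
    return [min(range(len(centres)), key=lambda c: sq_dist(r, centres[c]))
            for r in rows]


def kmeans(rows, k, max_iter=100):
    """Lloyd's algorithm; returns (labels, inertia)."""
    # Farthest-first initialisation (an assumption, chosen for determinism).
    centres = [rows[0]]
    while len(centres) < k:
        centres.append(max(rows, key=lambda r: min(sq_dist(r, c) for c in centres)))
    for _ in range(max_iter):
        labels = assign(rows, centres)
        updated = []
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            # Keep the old centre if a cluster ends up empty.
            updated.append([mean(col) for col in zip(*members)] if members else centres[c])
        if updated == centres:
            break
        centres = updated
    labels = assign(rows, centres)
    inertia = sum(sq_dist(r, centres[lab]) for r, lab in zip(rows, labels))
    return labels, inertia


# Invented per-user counts: [item pages, event pages, queries].
users = [
    [0, 0, 0], [1, 0, 0], [0, 0, 0], [0, 0, 0],   # brief visits
    [20, 0, 5], [18, 0, 6], [22, 0, 4],           # item-heavy, with queries
    [0, 6, 0], [0, 5, 0], [1, 7, 0],              # event pages only
]
scaled = standardise(users)

# Elbow curve: within-cluster sum of squares for each candidate k.
inertias = {k: kmeans(scaled, k)[1] for k in range(1, 6)}
for k, value in inertias.items():
    print(k, round(value, 3))

labels, _ = kmeans(scaled, 3)
print(labels)
```

Without scaling, the item-page column would dominate the distance calculation because its counts are much larger than the others.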
Cluster Classification Principles
User group characteristic   Log data
Motivation                  Starting level page (first page URI in session)
Domain / CH Knowledge       Page type and queries
Task                        Page type and possibly queries
Location                    IP (reversed) identifying country, region and city
Frequency of visits         Repeat visits (sessions), queries, length of session
Findings from preliminary clustering
[Figure: the seven clusters found: Single page viewer; High all round searcher; Event visitor; Single query general page visitor; Deep level browser; General museum visitor; Known item searcher. Values shown on the figure: 1.0, 50, 3.5, 1.2, 6, 17.5, 20]
Potential mapping to known user groups
Cluster #   Users     Cluster label                        Potential user group
1           172,692   Single page viewers                  Currently undocumented user group called “Bouncers”
2           46        High all round searchers             Non-Professionals (Hobbyists)
3           4,162     Event visitors                       Teachers / General Public
4           45,282    Single query general page visitors   General Public (Pre-Visit) / Teachers
5           292       Deep level browsers                  Museum Staff
6           290       General museum visitors              General Public / Students
7           2,966     Known item searchers                 Academics (Experts) / Non-Professionals (Hobbyists)
Conclusion
Cluster analysis indicates that the earlier survey study is representative.
Cluster analysis extends the survey results with a behavioural dimension.
Future work
● Explore the behaviours of the clustered groups in more detail and enhance the known user definitions.
● Extend clustering to look at other data such as location and museum/gallery accessed.
● Explore clustering only the users we think are General Public (GP) and see if sub-groups emerge.
Thank you for your attention
Link to the full paper:
https://link.springer.com/chapter/10.1007/978-3-030-30760-8_7

