IRUS: from counting clicks to COUNTER stats
20 September 2022
What we will cover
• IRUS context and overview
• How does it work?
• Usage data
• Collecting
• Handling
• Processing
• Storing
• Exposing statistics using the API and examples
• What is next?
• Q&A
IRUS context
IRUS: Open and flexible access to comparable and standardised usage statistics
for repositories
• Based on the COUNTER Code of Practice, the international
standard for measuring usage of e-resources
• 199 active participating repositories across 159
organisations
• Over 17 million individual items
• Between 2M and 6M usage events received daily
IRUS
IRUS-UK
IRUS-CORE
IRUS-ANZ
IRUS-US
IRUS-OAPEN
High-level overview
Collect raw usage data
• Repositories send logs via tracker protocol
Process into COUNTER stats
• Filter out robots, rogue usage and double-clicks
• Add metadata
Enrich with additional information
• ORCIDs
• IRUS item types
Expose
• API based on COUNTER SUSHI standard
Present and export
• Web reporting interface
• Widget
Curate the data
How we collect usage data – the Tracker Protocol
• We need a standard approach to collect raw usage data when
repository pages are viewed and full content downloaded
• The Tracker Protocol
• Devised in collaboration with COUNTER
• A user* clicks on a link to an item page (i.e. views item metadata) or an associated
file (i.e. requests a download)
• An OpenURL-like log entry – a “tracker message” - is sent to a URL endpoint on the
IRUS server for further processing
• Tracker messages are stored in daily** log files
• See the Tracker Protocol specification for COUNTER R5 conformance
* The ‘user’ could be a human or a machine
** The date messages are received, which isn’t necessarily the same as the date a usage event
happened
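As a rough illustration of what an OpenURL-like tracker message might look like, the sketch below builds one as a URL query string. The endpoint is a placeholder, and the key names follow general OpenURL conventions; both are assumptions here, and the Tracker Protocol specification remains the authority on the actual fields.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

# Placeholder endpoint: the slides only say "a URL endpoint on the IRUS server".
TRACKER_ENDPOINT = "https://tracker.example.org/counter/"


def build_tracker_message(item_oai_id, client_ip, user_agent, event_time,
                          repository_host, file_url=None):
    """Return an OpenURL-like tracker URL for one usage event.

    Key names are illustrative assumptions, not copied from the specification.
    """
    params = {
        "url_ver": "Z39.88-2004",  # OpenURL version tag
        "url_tim": event_time.strftime("%Y-%m-%dT%H:%M:%SZ"),  # when the event happened
        "req_id": client_ip,       # who asked (IP address)
        "req_dat": user_agent,     # with which client
        "rft.artnum": item_oai_id, # which item
        "rfr_id": repository_host, # which repository sent the message
        # A file URL marks a download; its absence marks a metadata view.
        "rft_dat": "Request" if file_url else "Investigation",
    }
    if file_url:
        params["svc_dat"] = file_url
    return TRACKER_ENDPOINT + "?" + urlencode(params)


message = build_tracker_message(
    item_oai_id="oai:repo.example.org:1234",
    client_ip="192.0.2.10",
    user_agent="Mozilla/5.0",
    event_time=datetime(2022, 9, 19, 10, 15, 0, tzinfo=timezone.utc),
    repository_host="repo.example.org",
    file_url="https://repo.example.org/1234/1/paper.pdf",
)
print(message)
```

Carrying the event time inside the message matters because, as noted above, the daily log file records when a message was received, which may differ from when the usage happened.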
Tracker Protocol Implementations
• Various software platforms underpin Institutional Repositories
• Each needs its own Tracker Protocol implementation
• Out-of-the-box standard implementations:
• DSpace, Eprints, Figshare, Haplo, Fedora-Samvera (on-the-fly, as usage occurs)
• Worktribe (batch data, previous day’s usage)
• Out-of-the-box partial (50%) standard implementation:
• Elsevier Pure (batch data, previous day’s usage)
• Only sends data about file downloads NOT metadata views
• Bespoke standard implementations:
• CORE, Equella, Other (on-the-fly, as usage occurs)
• Esploro, Fedora-Other (batch data, previous day’s usage)
• See https://irus.jisc.ac.uk/r5/participate/implement/
Processing log file usage data
• Takes place every day at 3:30am
• A scheduled task processes data in the previous day’s log files
• To put it simply:
• Gets rid of ‘rubbish’ usage data it finds in the logs
• Puts eligible usage event data into a Tracker Data table for further
processing
• It’s easier to describe more fully in a diagram . . .
Daily Tracker Log Processing – scheduled process at 3:30am each day
• Sources: tracker data (on the fly and daily batch) from 199 repositories, held in daily log files
• Tables: Processing History, Trackers, Repositories, Server Authority, Blacklisted servers
• Tracker Log Processing Script
• Rejected: COUNTER Robot Exclusions, fake referrers, malformed messages, blacklisted servers
• Messages from unknown repositories: Unregistered Tracker Data table
• Eligible messages from registered repositories: Monthly Tracker Data table
• Summary reports
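The routing in this diagram can be sketched as a single classification function. This is a minimal sketch, not the IRUS script: the field names, the order of the checks and the sample patterns are all assumptions made for illustration.

```python
import re
from enum import Enum


class Outcome(Enum):
    MALFORMED = "malformed message"
    BLACKLISTED = "blacklisted server"
    ROBOT = "COUNTER robot exclusion"
    FAKE_REFERRER = "fake referrer"
    UNREGISTERED = "unknown repository"
    ELIGIBLE = "eligible"


# Hypothetical field names for a parsed tracker message.
REQUIRED_FIELDS = ("ip", "user_agent", "item_id", "repository", "timestamp")


def classify(message, robot_patterns, fake_referrers,
             blacklisted_servers, registered_repos):
    """Decide where one parsed tracker message should go."""
    if any(not message.get(field) for field in REQUIRED_FIELDS):
        return Outcome.MALFORMED
    if message["ip"] in blacklisted_servers:
        return Outcome.BLACKLISTED
    # Robot lists are typically regular expressions matched against the user agent.
    if any(re.search(p, message["user_agent"], re.IGNORECASE) for p in robot_patterns):
        return Outcome.ROBOT
    if message.get("referrer") in fake_referrers:
        return Outcome.FAKE_REFERRER
    if message["repository"] not in registered_repos:
        return Outcome.UNREGISTERED  # would go to the Unregistered Tracker Data table
    return Outcome.ELIGIBLE          # would go to the monthly Tracker Data table


sample = {
    "ip": "192.0.2.10",
    "user_agent": "Mozilla/5.0",
    "item_id": "oai:repo.example.org:1",
    "repository": "repo.example.org",
    "timestamp": "2022-09-19T10:00:00Z",
}
print(classify(sample, [r"bot", r"spider"], set(), set(), {"repo.example.org"}))
```

Returning a named outcome rather than a boolean makes it straightforward to produce the kind of per-category counts a summary report would need.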
Processing Tracker Data table usage events - Daily
• A scheduled task processes data in the current month’s Tracker Data table
• The task consists of a ‘controller’ script that runs a dozen other scripts, which
between them:
• Identify and eliminate usage that falls foul of IRUS exclusions*
• Harvest bibliographic metadata for items that IRUS hasn’t encountered before
• Utilises standard OAI-PMH and APIs
• Includes assigning an IRUS Item Type based on source item types exposed in metadata*
• Collect and validate ORCiDs in item metadata to populate Author Authority tables*
• Perform COUNTER R5 processing that converts usage data to Daily statistics
• See how your data has been processed in the Processing statistics report
• Time for another diagram . . .
* See later slides
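One part of converting usage events into statistics is collapsing double-clicks, which the COUNTER Code of Practice treats as repeat clicks by the same user on the same item within 30 seconds. The sketch below counts actions per item under that rule. The event layout, the way a "user" is keyed and the rolling comparison against the previous click are assumptions for illustration, not a description of the IRUS scripts.

```python
from collections import defaultdict

DOUBLE_CLICK_WINDOW = 30  # seconds


def count_actions(events, window=DOUBLE_CLICK_WINDOW):
    """Count usage actions per (item_id, metric), collapsing double-clicks.

    events: iterable of (user_key, item_id, metric, unix_seconds), where
    user_key is some stand-in for a user, e.g. IP address plus user agent.
    """
    clicks = defaultdict(list)
    for user_key, item_id, metric, ts in events:
        clicks[(user_key, item_id, metric)].append(ts)

    totals = defaultdict(int)
    for (_, item_id, metric), stamps in clicks.items():
        previous = None
        for ts in sorted(stamps):
            # A click within the window of the previous click is a double-click.
            if previous is None or ts - previous > window:
                totals[(item_id, metric)] += 1
            previous = ts
    return dict(totals)


events = [
    ("u1", "item1", "request", 0),
    ("u1", "item1", "request", 10),   # double-click, not counted
    ("u1", "item1", "request", 25),   # still within 30s of the previous click
    ("u1", "item1", "request", 100),  # a new action
    ("u2", "item1", "request", 5),    # different user, counted
    ("u1", "item1", "investigation", 0),
]
print(count_actions(events))
```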
Daily Tracker Data Processing – scheduled process at 6:00am every day
• Tables: Processing history; Monthly Tracker Data (usage events that occurred two days ago); IRUS Item Types Mapping Rules; Author Authority Candidates
• Tracker Data Processing Script
• Data processing: IRUS Daily Exclusions
• Metadata processing (Item Metadata Table): harvest metadata (OAI-PMH); harvest OAPEN metadata (OAI-PMH); harvest CORE metadata (API); harvest Vivli metadata (API); harvest Pure dataset metadata (API); process author authority candidates (Author Authority Table, Author Authority Item Lookup Table)
• Daily statistics processing: daily eligible COUNTER data processing; daily statistics creation (Daily Statistics Tables, provisional statistics)
• Summary reports
IRUS exclusions – robot and rogue usage
• Use of the COUNTER User Agent Exclusion List
• Is the minimum COUNTER requirement for robot detection
• Works reasonably well for traditional scholarly publishers behind pay barriers
• But it’s not enough in the open access world
• Besides ‘good’ bots like Googlebot, there are
• ‘bad’ bots that don’t declare themselves as bots but are mostly harmless
• and a host of others: hackers, spammers, dictionary attackers, etc.
• In addition, based on extensive analysis of our logs, we also eliminate usage from
• IPs with 40 or more downloads in a single day
• IP/UAs with 10 or more downloads of a single item in a single day
• IP ranges, grouped by their first three octets, that have 300 or more downloads in a day
• During an audit review, the COUNTER auditors agreed that these are reasonable
extra measures to remove robotic/rogue activity from our statistics
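The three thresholds above can be expressed as counts over one day of download events. This is a minimal sketch assuming IPv4 addresses and a simple tuple layout for events; the function and variable names are hypothetical.

```python
from collections import Counter


def find_excluded(downloads, ip_limit=40, item_limit=10, range_limit=300):
    """Apply the three daily thresholds to one day's downloads.

    downloads: iterable of (ip, user_agent, item_id) tuples for a single day.
    Returns (bad_ips, bad_ip_uas, bad_ranges).
    """
    downloads = list(downloads)
    per_ip = Counter(ip for ip, _, _ in downloads)
    per_ip_ua_item = Counter(downloads)
    # Group IPv4 addresses by their first three octets, e.g. "10.0.3".
    per_range = Counter(ip.rsplit(".", 1)[0] for ip, _, _ in downloads)

    bad_ips = {ip for ip, n in per_ip.items() if n >= ip_limit}
    bad_ip_uas = {(ip, ua) for (ip, ua, _), n in per_ip_ua_item.items()
                  if n >= item_limit}
    bad_ranges = {prefix for prefix, n in per_range.items() if n >= range_limit}
    return bad_ips, bad_ip_uas, bad_ranges


def is_excluded(event, bad_ips, bad_ip_uas, bad_ranges):
    """True if a download event falls under any of the three exclusions."""
    ip, ua, _ = event
    return (ip in bad_ips
            or (ip, ua) in bad_ip_uas
            or ip.rsplit(".", 1)[0] in bad_ranges)
```

Because the thresholds are only known once the whole day has been counted, a check like this has to run after the day is complete rather than as each message arrives.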
IRUS & Item Types
• When we harvest item metadata from repositories, one of the fields we
capture is the dc:type field
• Describes the nature or genre of the item - article, book, thesis, etc.
• It does not describe the Subject or Format of the item
• There is a lack of standardisation in the use of item types across
repositories
• We encounter literally thousands of terms in dc:type
• Default lists of item types provided by software platform
• Lists of item types developed by individual institutions
• Controlled vocabularies, including COAR Resource Types
• Terms that have nothing to do with ‘type’
• This isn’t very useful and is a barrier to comparability
• Hence we need an appropriate, meaningful and useful set of item types across the
whole of IRUS
IRUS Item Types Mappings
• The original set of IRUS item types was defined in 2012
• Revisited and revised a number of times
• We used a manual mapping process, which had become unsustainable
• The current set of IRUS item types was defined in July 2022
• Based on analysis of over 4 million item records
• We expanded and enhanced the list, which consists of 31 IRUS item types
• We now use an automated, programmatic solution mapping to those IRUS types
• 40+ rules derived from analysis of over 4 million item records
• For more information, see the IRUS
• Item types and mapping policy
• Item type mappings report
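A rule-based mapping of this kind might be sketched as an ordered list of patterns where the first match wins. The rules and target names below are invented for illustration; they are not the actual IRUS rules or the actual 31 IRUS item types, which are documented in the policy and report mentioned above.

```python
import re

# Illustrative rules only: (pattern over the source dc:type value, target type).
# Order matters: more specific patterns come before more general ones.
RULES = [
    (re.compile(r"\b(thesis|dissertation)\b", re.IGNORECASE), "Thesis or dissertation"),
    (re.compile(r"\bbook\s*(chapter|section|part)\b", re.IGNORECASE), "Book section"),
    (re.compile(r"\bbook\b", re.IGNORECASE), "Book"),
    (re.compile(r"\barticle\b", re.IGNORECASE), "Article"),
]
FALLBACK = "Other"


def map_item_type(dc_types):
    """Map an item's dc:type values to a single target type.

    The first dc:type value that matches any rule decides the result.
    """
    for raw in dc_types:
        for pattern, target in RULES:
            if pattern.search(raw):
                return target
    return FALLBACK


print(map_item_type(["Text", "PhD Thesis"]))
print(map_item_type(["info:eu-repo/semantics/article"]))
```

Keeping the rules as data rather than code means a new term seen in dc:type can be handled by adding a row, which is one way an automated approach avoids the manual mapping effort described above.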
Author Authority - ORCiDs
• When we harvest item metadata, we scan for strings that look like ORCiDs
• These are added to the Authority Candidates table
• A subsequent script processes each ORCiD candidate
• If the ORCiD isn’t already in our system
• We put out a call to the orcid.org API to validate and verify the existence of the
ORCiD, and retrieve canonical author information
• If the ORCiD is found, we update the Author Authority and Item lookup tables
• If not, the ORCiD is discarded
• If the ORCiD is already known to our system
• We just update the Item lookup table to create an association between the ORCiD
and its item
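Scanning for "strings that look like ORCiDs" could be done with a regular expression. The sketch below also applies the ORCID check-digit calculation (ISO 7064 MOD 11-2) as a local pre-filter before any API call; that pre-filter is an assumption added here for illustration, since the slides only describe validation against the orcid.org API.

```python
import re

# Four groups of four characters; the last character may be the check digit X.
ORCID_PATTERN = re.compile(r"\b(\d{4}-\d{4}-\d{4}-\d{3}[\dX])\b")


def checksum_ok(orcid):
    """Check the final character against the ISO 7064 MOD 11-2 check digit."""
    digits = orcid.replace("-", "")
    total = 0
    for ch in digits[:-1]:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    expected = "X" if result == 10 else str(result)
    return digits[-1] == expected


def find_orcid_candidates(text):
    """Return ORCID-shaped strings in text whose check digit is consistent."""
    return [match for match in ORCID_PATTERN.findall(text) if checksum_ok(match)]


metadata = (
    "creator: Carberry, Josiah https://orcid.org/0000-0002-1825-0097; "
    "contributor id 0000-0002-1825-0098"
)
print(find_orcid_candidates(metadata))
```

A consistent check digit only shows that the string is well formed. Whether the ORCiD actually exists, and who it belongs to, still has to come from the orcid.org lookup described above.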
Processing Tracker Data table usage events - Monthly
• A set of 24 tasks processes data in the previous month’s Tracker Data table
• e.g. on 3rd September 2022 we produced the stats for August 2022
• The tasks fall (broadly) into four categories
• Data analysis
• Building up a picture of ‘user’ activity over time
• Future improvements in robot and rogue usage detection
• Data processing
• Reprocessing IRUS exclusions across the month
• Metadata processing
• Reprocessing metadata harvesting across the month
• Monthly Statistics Processing
• Producing COUNTER conformant monthly statistics
• Time for another diagram . . .
Monthly Tracker Data Processing – (will be scheduled to) run on the 3rd
of each month
• Tables: Processing History; Monthly Tracker Data; Item Metadata; Author Authority Candidates
• Tracker Data Processing Script
• Data analysis: IP address/User Agent activity; IP address/User Agent distribution (IP/UA activity tables)
• Data processing: IRUS Exclusions
• Metadata processing: harvest metadata (OAI-PMH & APIs); harvest metadata (RIOXX); process author authority candidates (Author Authority Table, Author Authority Item Lookup Table)
• Monthly statistics processing: eligible COUNTER data processing; monthly statistics creation (Monthly Statistics Tables: IRUS PR & IR, OAPEN PR & IR, CORE PR)
• Summary reports
Metadata Curation
• Historically, we’ve only harvested metadata for an item when first
encountered
• We’d only update metadata where we knew it was necessary
• However, it’s become increasingly apparent that we should regularly
refresh our metadata records
• There are frequent changes to repository records – (un)deletions,
corrections, enhancements . . .
• We’re currently updating all item metadata following the move to
automated and updated item type mapping
• We’re implementing regular incremental harvesting to pick up
metadata changes in repository records
Data Curation
• Daily statistics tables get very big, very quickly
• Performance and storage issues
• We only keep statistics for the current month and the previous two months
• Older daily statistics are deleted on a monthly basis
• We’re very mindful of GDPR requirements!
• Usage data we gather includes IP addresses
• We store that data securely – only as long as we need it
• COUNTER rules require us to keep raw usage data for the current year
plus the previous two years
• Each year we delete old log files and old records from our database, which
are no longer required
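The two retention windows described above translate into simple cutoff dates. The sketch below computes them; the function names are hypothetical and anything older than the returned date would be a candidate for deletion.

```python
from datetime import date


def daily_stats_cutoff(today):
    """First day of the oldest month kept (current month plus previous two)."""
    month_index = today.year * 12 + (today.month - 1) - 2
    return date(month_index // 12, month_index % 12 + 1, 1)


def raw_data_cutoff(today):
    """First day of the oldest year kept (current year plus previous two)."""
    return date(today.year - 2, 1, 1)


today = date(2022, 9, 20)
print(daily_stats_cutoff(today))  # daily statistics before this date can go
print(raw_data_cutoff(today))     # raw usage data before this date can go
```

Working in whole months and whole years, rather than a fixed number of days, keeps the windows aligned with how the statistics are reported.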
Exposing statistics – IRUS Custom API
• Once the statistics are in the database we need to expose them
• We have a number of API methods to retrieve
• Daily statistics
• Item level
• Available for current month + two previous months
• Monthly statistics
• Item level and Platform level
• Available from the time we started collecting statistics for any given repository
• Formats: JSON, and tabular – CSV/TSV
• Openly available to participants and other third parties
Exposing statistics – example API call
https://irus.jisc.ac.uk/api/v3/irus/reports/[report_id]/?
requestor_id=[institutional Requestor_ID]&
begin_date=[YYYY-MM | YYYY-MM-DD]&
end_date=[YYYY-MM | YYYY-MM-DD]
{& optional parameters, e.g. platform, item_id, metric_type, content_type}
Many example calls on https://irus.jisc.ac.uk/r5/embed/api/
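The call pattern above can be assembled programmatically. This sketch only builds the URL and does not send a request; the report identifier "ir", the requestor ID and the item identifier are placeholder values, so check the example calls page for the values that apply to your repository.

```python
from urllib.parse import urlencode

BASE_URL = "https://irus.jisc.ac.uk/api/v3/irus/reports"


def build_report_url(report_id, requestor_id, begin_date, end_date, **optional):
    """Build a report URL from the required and any optional parameters.

    Dates are strings in YYYY-MM or YYYY-MM-DD form, as in the pattern above.
    """
    params = {
        "requestor_id": requestor_id,
        "begin_date": begin_date,
        "end_date": end_date,
    }
    # Optional parameters such as platform, item_id, metric_type, content_type.
    params.update({key: value for key, value in optional.items() if value is not None})
    return f"{BASE_URL}/{report_id}/?{urlencode(params)}"


url = build_report_url(
    "ir",                      # placeholder report identifier
    "YOUR_REQUESTOR_ID",       # placeholder institutional Requestor_ID
    "2022-06",
    "2022-08",
    item_id="oai:repo.example.org:1234",  # placeholder item identifier
)
print(url)
```

Using `urlencode` rather than string concatenation ensures that characters such as the colons in an item identifier are escaped correctly.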
Exposing statistics – using the API
API
• Excel (CSV)
• Website (IRUS)
• Website (via widget)
Exposing statistics – widget example
More information at https://irus.jisc.ac.uk/r5/embed/widget/
What’s happening now and next?
In progress
• Metadata refresh
• Repository size and scale information
• Backend reporting and monitoring
Planned
• COUNTER Release 5.1
• COUNTER Compliance Audit
• R4 stats in the Individual Item Report
Considering
• CORE and repository usage
• Journal information
• Funder information
• Search
• Request reports by email
• Regular reports to your inbox
• Visualisations
Questions
Contact us
Email help@jisc.ac.uk
Mention IRUS in the subject line

 
Mythology Quiz-4th April 2024, Quiz Club NITW
Mythology Quiz-4th April 2024, Quiz Club NITWMythology Quiz-4th April 2024, Quiz Club NITW
Mythology Quiz-4th April 2024, Quiz Club NITW
 
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
 
CLASSIFICATION OF ANTI - CANCER DRUGS.pptx
CLASSIFICATION OF ANTI - CANCER DRUGS.pptxCLASSIFICATION OF ANTI - CANCER DRUGS.pptx
CLASSIFICATION OF ANTI - CANCER DRUGS.pptx
 
MS4 level being good citizen -imperative- (1) (1).pdf
MS4 level   being good citizen -imperative- (1) (1).pdfMS4 level   being good citizen -imperative- (1) (1).pdf
MS4 level being good citizen -imperative- (1) (1).pdf
 
Objectives n learning outcoms - MD 20240404.pptx
Objectives n learning outcoms - MD 20240404.pptxObjectives n learning outcoms - MD 20240404.pptx
Objectives n learning outcoms - MD 20240404.pptx
 

From clicks to stats: How IRUS processes repository usage data

  • 1. IRUS: from counting clicks to COUNTER stats 20 September 2022
  • 2. What we will cover
      • IRUS context and overview
      • How does it work?
      • Usage data
      • Collecting
      • Handling
      • Processing
      • Storing
      • Exposing statistics using the API and examples
      • What is next?
      • Q&A
  • 3. IRUS context
      • IRUS: open and flexible access to comparable and standardised usage statistics for repositories
      • Based on the COUNTER Code of Practice, the international standard for measuring usage of e-resources
      • 199 active participating repositories across 159 organisations
      • Over 17 million individual items
      • Between 2M and 6M usage events received daily
      • Diagram: IRUS, with IRUS-UK, IRUS-CORE, IRUS-ANZ, IRUS-US and IRUS-OAPEN
  • 4. High-level overview
      • Collect raw usage data
          • Repositories send logs via the Tracker Protocol
      • Process into COUNTER stats
          • Filter out robots, rogue usage and double-clicks
          • Add metadata
      • Enrich with additional information
          • ORCIDs
          • IRUS item types
      • Expose
          • API based on the COUNTER SUSHI standard
      • Present and export
          • Web reporting interface
          • Widget
      • Curate the data
  • 5. How we collect usage data – the Tracker Protocol
      • We need a standard approach to collect raw usage data when repository pages are viewed and full content downloaded
      • The Tracker Protocol
          • Devised in collaboration with COUNTER
          • A user* clicks on a link to an item page (i.e. views item metadata) or an associated file (i.e. requests a download)
          • An OpenURL-like log entry – a “tracker message” – is sent to a URL endpoint on the IRUS server for further processing
          • Tracker messages are stored in daily** log files
      • The Tracker Protocol specification for COUNTER R5 conformance
      * The ‘user’ could be a human or a machine
      ** The date messages are received, which isn’t necessarily the same as the date a usage event happened
  • 6. Tracker Protocol Implementations
      • Various software platforms underpin Institutional Repositories
      • Each needs its own Tracker Protocol implementation
      • Out-of-the-box standard implementations:
          • DSpace, Eprints, Figshare, Haplo, Fedora-Samvera (on-the-fly, as usage occurs)
          • Worktribe (batch data, previous day’s usage)
      • Out-of-the-box 50% standard implementation:
          • Elsevier Pure (batch data, previous day’s usage)
          • Only sends data about file downloads, NOT metadata views
      • Bespoke standard implementations:
          • CORE, Equella, Other (on-the-fly, as usage occurs)
          • Esploro, Fedora-Other (batch data, previous day’s usage)
      • See https://irus.jisc.ac.uk/r5/participate/implement/
  • 7. Processing log file usage data
      • Takes place every day at 3:30am
      • A scheduled task processes data in the previous day’s log files
      • To put it simply:
          • Gets rid of ‘rubbish’ usage data it finds in the logs
          • Puts eligible usage event data into a Tracker Data table for further processing
      • It’s easier to describe more fully in a diagram . . .
  • 8. Daily Tracker Log Processing – scheduled process at 3:30am each day (diagram)
      • Input: daily log files holding tracker data from 199 repositories, sent on the fly or as a daily batch
      • Tracker Log Processing Script, with tables: Processing History, Trackers, Repositories, Server Authority, Blacklisted servers
      • Discarded: COUNTER Robot Exclusions, fake referrers, malformed messages, blacklisted servers
      • Messages from unknown repositories go to the Unregistered Tracker Data table
      • Eligible messages from registered repositories go to the Monthly Tracker Data table
      • Output: Summary reports
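The triage shown in the diagram above can be sketched roughly as follows. This is an illustration only, not the IRUS script: the field names (`repository`, `item`, `ip`, `user_agent`), the robot patterns, the example hosts and the order of the checks are all assumptions, and the fake-referrer check is left out because the slides do not describe it.

```python
import re
from urllib.parse import parse_qs

# Hypothetical stand-ins for the COUNTER robot list and the IRUS lookup tables.
ROBOT_PATTERNS = [re.compile(p, re.IGNORECASE) for p in (r"bot", r"crawl", r"spider")]
REGISTERED_REPOSITORIES = {"repo.example.ac.uk"}
BLACKLISTED_SERVERS = {"203.0.113.9"}
# Hypothetical field names; the real Tracker Protocol defines its own parameters.
REQUIRED_FIELDS = ("repository", "item", "ip", "user_agent")


def triage(raw_message: str) -> str:
    """Classify one query-string style log entry into a processing bucket."""
    fields = {key: values[0] for key, values in parse_qs(raw_message).items()}
    if any(name not in fields for name in REQUIRED_FIELDS):
        return "malformed"
    if fields["ip"] in BLACKLISTED_SERVERS:
        return "blacklisted"
    if any(p.search(fields["user_agent"]) for p in ROBOT_PATTERNS):
        return "robot"
    if fields["repository"] not in REGISTERED_REPOSITORIES:
        return "unregistered"  # would go to the Unregistered Tracker Data table
    return "eligible"  # would go to the Monthly Tracker Data table
```

Returning a label rather than writing to a table keeps the sketch testable; a real script would route each bucket to its table or discard it.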
  • 9. Processing Tracker Data table usage events - Daily
      • A scheduled task processes data in the current month’s Tracker Data table
      • The task consists of a ‘controller’ script that runs a dozen other scripts, which between them:
          • Identify and eliminate usage that falls foul of IRUS exclusions*
          • Harvest bibliographic metadata for items that IRUS hasn’t encountered before
              • Uses standard OAI-PMH and APIs
              • Includes assigning an IRUS Item Type based on source item types exposed in metadata*
          • Collect and validate ORCiDs in item metadata to populate Author Authority tables*
          • Perform COUNTER R5 processing that converts usage data to Daily statistics
      • See how your data has been processed in the Processing statistics report
      • Time for another diagram . . .
      * See later slides
  • 10. Daily Tracker Data Processing – scheduled process at 6:00am every day (diagram)
      • Input: Monthly Tracker Data table (usage events that occurred two days ago)
      • Tracker Data Processing Script, with tables: Processing history, IRUS Item Types Mapping Rules, Author Authority Candidates
      • Data processing
          • IRUS Daily Exclusions
      • Metadata processing (Item Metadata Table)
          • Harvest metadata - OAI-PMH
          • Harvest OAPEN metadata - OAI-PMH
          • Harvest CORE metadata - API
          • Harvest Vivli metadata - API
          • Harvest Pure dataset metadata - API
          • Process author authority candidates (Author Authority Table, Author Authority Item Lookup Table)
      • Daily statistics processing
          • Daily eligible COUNTER data processing
          • Daily statistics creation
      • Outputs: Daily Statistics Tables (provisional statistics), Summary reports
  • 11. IRUS exclusions – robot and rogue usage
      • Use of the COUNTER User Agent Exclusion List
          • Is the minimum COUNTER requirement for robot detection
          • Works reasonably well for traditional scholarly publishers behind pay barriers
          • But it’s not enough in the open access world
      • Besides ‘good’ bots like Googlebot, there are
          • ‘bad’ bots that don’t declare themselves as bots but are mostly harmless
          • and a host of others: hackers, spammers, dictionary attackers, etc.
      • In addition, based on extensive analysis of our logs, we also eliminate usage from
          • IPs with 40 or more downloads in a single day
          • IP/UAs with 10 or more downloads of a single item in a single day
          • IP ranges grouped by the first three octets that have 300 or more downloads in a day
      • During an audit review, the COUNTER auditors agreed that these are reasonable extra measures to remove robotic/rogue activity from our statistics
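The three extra thresholds above could be applied to one day of download events along these lines. This is a minimal sketch, assuming IPv4 addresses and an in-memory list of events; the `Download` record and the function names are hypothetical, and only the three limits come from the slide.

```python
from collections import Counter
from typing import NamedTuple

IP_DAILY_LIMIT = 40            # downloads per IP per day
IP_UA_ITEM_DAILY_LIMIT = 10    # downloads of one item per IP/user agent per day
RANGE_DAILY_LIMIT = 300        # downloads per first-three-octets range per day


class Download(NamedTuple):
    ip: str
    user_agent: str
    item_id: str


def ip_range(ip: str) -> str:
    """Return the first three octets of an IPv4 address, e.g. '192.0.2'."""
    return ip.rsplit(".", 1)[0]


def filter_rogue(downloads: list[Download]) -> list[Download]:
    """Drop every download from a source that reaches any daily limit."""
    per_ip = Counter(d.ip for d in downloads)
    per_ip_ua_item = Counter((d.ip, d.user_agent, d.item_id) for d in downloads)
    per_range = Counter(ip_range(d.ip) for d in downloads)
    return [
        d for d in downloads
        if per_ip[d.ip] < IP_DAILY_LIMIT
        and per_ip_ua_item[(d.ip, d.user_agent, d.item_id)] < IP_UA_ITEM_DAILY_LIMIT
        and per_range[ip_range(d.ip)] < RANGE_DAILY_LIMIT
    ]
```

Counting first and filtering afterwards means all usage from an offending source is removed, not just the events past the threshold.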
  • 12. IRUS & Item Types
      • When we harvest item metadata from repositories, one of the fields we capture is the dc:type field
          • Describes the nature or genre of the item - article, book, thesis, etc.
          • It does not describe the Subject or Format of the item
      • There is a lack of standardisation in the use of item types across repositories
      • We encounter literally thousands of terms in dc:type
          • Default lists of item types provided by software platforms
          • Lists of item types developed by individual institutions
          • Controlled vocabularies, including COAR Resource Types
          • Terms that have nothing to do with ‘type’
      • This isn’t very useful and is a barrier to comparability
      • Hence we need an appropriate, meaningful and useful set of item types across the whole of IRUS
  • 13. IRUS Item Types Mappings
      • The original set of IRUS item types was defined in 2012
          • Revisited and revised a number of times
          • We used a manual mapping process, which had become unsustainable
      • The current set of IRUS item types was defined in July 2022
          • Based on analysis of over 4 million item records
          • We expanded and enhanced the list, which consists of 31 IRUS item types
      • We now use an automated, programmatic solution mapping to those IRUS types
          • 40+ rules derived from analysis of over 4 million item records
      • For more information, see the IRUS
          • Item types and mapping policy
          • Item type mappings report
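A rule-based mapping of this kind might look something like the sketch below. The rules, the target type names and the fallback are invented for illustration; the actual 40+ IRUS rules and 31 item types are documented in the IRUS mapping policy, not here.

```python
import re

# Hypothetical ordered rules: the first pattern that matches wins, so more
# specific patterns (book chapter) must come before general ones (book).
MAPPING_RULES = [
    (re.compile(r"thesis|dissertation", re.IGNORECASE), "Thesis or dissertation"),
    (re.compile(r"book\s*(chapter|section|part)", re.IGNORECASE), "Book chapter"),
    (re.compile(r"book", re.IGNORECASE), "Book"),
    (re.compile(r"article", re.IGNORECASE), "Article"),
]
FALLBACK_TYPE = "Other"  # hypothetical catch-all


def map_item_type(dc_types: list[str]) -> str:
    """Map the raw dc:type values of one item to a single normalised type."""
    for raw in dc_types:
        for pattern, target in MAPPING_RULES:
            if pattern.search(raw):
                return target
    return FALLBACK_TYPE
```

For example, `map_item_type(["PeerReviewed", "Journal Article"])` skips the first value, which matches no rule, and maps the item on the second.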
  • 14. Author Authority - ORCiDs
      • When we harvest item metadata, we scan for strings that look like ORCiDs
          • These are added to the Authority Candidates table
      • A subsequent script processes each ORCiD candidate
      • If the ORCiD isn’t already in our system
          • We put out a call to the orcid.org API to validate and verify the existence of the ORCiD, and retrieve canonical author information
          • If the ORCiD is found, we update the Author Authority and Item lookup tables
          • If not, the ORCiD is discarded
      • If the ORCiD is already known to our system
          • We just update the Item lookup table to create an association between the ORCiD and its item
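Scanning metadata for ORCiD-like strings could be sketched as below. The regular expression and function names are assumptions. The checksum step (ISO 7064 MOD 11-2, which ORCID uses for its final character) is an offline pre-filter added here for illustration; the slides only describe validation through the orcid.org API, which would still be needed to confirm that an identifier exists.

```python
import re

# Four groups of four digits; the last character may be the check value 'X'.
ORCID_PATTERN = re.compile(r"\b(\d{4}-\d{4}-\d{4}-\d{3}[\dX])\b")


def has_valid_checksum(orcid: str) -> bool:
    """Check the final character using ISO 7064 MOD 11-2."""
    digits = orcid.replace("-", "")
    total = 0
    for ch in digits[:-1]:
        total = (total + int(ch)) * 2
    check = (12 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return digits[-1] == expected


def find_orcid_candidates(text: str) -> list[str]:
    """Return ORCiD-like strings in the text whose checksum is consistent."""
    return [m for m in ORCID_PATTERN.findall(text) if has_valid_checksum(m)]
```

Rejecting strings with a bad check character before calling the API would save a request for obvious typos, at the cost of a little extra code.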
  • 15. Processing Tracker Data table usage events - Monthly
      • A set of 24 tasks processes data in the previous month’s Tracker_Data table
          • e.g. on 3rd September 2022 we produced the stats for August 2022
      • The tasks fall (broadly) into four categories
          • Data analysis
              • Building up a picture of ‘user’ activity over time
              • Future improvements in robot and rogue usage detection
          • Data processing
              • Reprocessing IRUS exclusions across the month
          • Metadata processing
              • Reprocessing metadata harvesting across the month
          • Monthly Statistics Processing
              • Producing COUNTER conformant monthly statistics
      • Time for another diagram . . .
  • 16. Monthly Tracker Data Processing – (will be scheduled to) run on the 3rd of each month (diagram)
      • Tracker Data Processing Script, with tables: Processing History, Monthly Tracker Data, Item Metadata, Author Authority Candidates
      • Data analysis
          • IP address/User Agent activity
          • IP address/User Agent distribution
          • IP/UA activity tables
      • Data processing
          • IRUS Exclusions
      • Metadata processing
          • Harvest metadata – OAI-PMH & APIs
          • Harvest metadata – RIOXX
          • Process author authority candidates (Author Authority Table, Author Authority Item Lookup Table)
      • Monthly statistics processing
          • Eligible COUNTER data processing
          • Monthly statistics creation
      • Outputs: Monthly Statistics Tables (IRUS PR & IR, OAPEN PR & IR, CORE PR), Summary reports
  • 17. Metadata Curation
      • Historically, we’ve only harvested metadata for an item when first encountered
          • We’d only update metadata where we knew it was necessary
      • However, it’s become increasingly apparent that we should regularly refresh our metadata records
          • There are frequent changes to repository records – (un)deletions, corrections, enhancements . . .
      • We’re currently updating all item metadata following the move to automated and updated item type mapping
      • We’re implementing regular incremental harvesting to pick up metadata changes in repository records
  • 18. Data Curation
      • Daily statistics tables get very big, very quickly
          • Performance and storage issues
          • We only keep statistics for the current month and the previous two months
          • Older daily statistics are deleted on a monthly basis
      • We’re very mindful of GDPR requirements!
          • Usage data we gather includes IP addresses
          • We store that data securely – only as long as we need it
          • COUNTER rules require us to keep raw usage data for the current year plus the previous two years
          • Each year we delete old log files and old database records that are no longer required
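The two retention windows above translate into simple cutoff dates. The helper names below are hypothetical, and the sketch only computes the dates; it assumes anything dated before a cutoff is what the clean-up jobs would delete.

```python
from datetime import date


def daily_stats_cutoff(today: date) -> date:
    """First day of the oldest month kept: current month plus previous two."""
    months = today.year * 12 + (today.month - 1) - 2
    return date(months // 12, months % 12 + 1, 1)


def raw_data_cutoff(today: date) -> date:
    """First day of the oldest year kept: current year plus previous two."""
    return date(today.year - 2, 1, 1)
```

On the date of this presentation, `daily_stats_cutoff(date(2022, 9, 20))` gives 1 July 2022 and `raw_data_cutoff(date(2022, 9, 20))` gives 1 January 2020.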
  • 19. Exposing statistics – IRUS Custom API
      • Once the statistics are in the database we need to expose them
      • We have a number of API methods to retrieve
          • Daily statistics
              • Item level
              • Available for current month + two previous months
          • Monthly statistics
              • Item level and Platform level
              • Available from the time we started collecting statistics for any given repository
      • Formats: JSON, and tabular – CSV/TSV
      • Openly available to participants and other third parties
  • 20. Exposing statistics – example API call
      https://irus.jisc.ac.uk/api/v3/irus/reports/[report_id]/?
        requestor_id=[institutional Requestor_ID]&
        begin_date=[YYYY-MM | YYYY-MM-DD]&
        end_date=[YYYY-MM | YYYY-MM-DD]
        {& optional parameters, e.g. platform, item_id, metric_type, content_type}
      • Many example calls on https://irus.jisc.ac.uk/r5/embed/api/
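Assembling a request URL from the template above can be sketched with the standard library. The function name is hypothetical, and the report ID and requestor ID are placeholders; valid values come from the IRUS API documentation and your institution's account. The sketch only builds the URL and does not send a request.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://irus.jisc.ac.uk/api/v3/irus/reports"


def build_report_url(report_id: str, requestor_id: str,
                     begin_date: str, end_date: str, **optional: str) -> str:
    """Build a report URL; dates are 'YYYY-MM' or 'YYYY-MM-DD' strings."""
    params = {
        "requestor_id": requestor_id,
        "begin_date": begin_date,
        "end_date": end_date,
        **optional,  # e.g. platform, item_id, metric_type, content_type
    }
    return f"{BASE_URL}/{quote(report_id, safe='')}/?{urlencode(params)}"


# Placeholder values, not real identifiers.
url = build_report_url("REPORT_ID", "REQUESTOR_ID", "2022-08", "2022-08",
                       metric_type="Total_Item_Requests")
```

Using `urlencode` rather than string concatenation keeps any special characters in parameter values correctly escaped.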
  • 21. Exposing statistics – using the API
      • Diagram: API, Excel (CSV), Website (IRUS), Website (via widget)
  • 22. Exposing statistics – widget example
      • More information at https://irus.jisc.ac.uk/r5/embed/widget/
  • 23. What’s happening now and next?
      • In progress
          • Metadata refresh
          • Repository size and scale information
          • Backend reporting and monitoring
      • Planned
          • COUNTER Release 5.1
          • COUNTER Compliance Audit
          • R4 stats in the Individual Item Report
      • Considering
          • CORE and repository usage
          • Journal information
          • Funder information
          • Search
          • Request reports by email
          • Regular reports to your inbox
          • Visualisations
  • 24. Questions
  • 25. Contact us
      • Email help@jisc.ac.uk
      • Mention IRUS in the subject line