Visualizing Open Access
building a scalable infrastructure to
showcase the reach of MIT research
Background
March 18, 2009 - Open Access Policy adopted
“...The policy is to take effect immediately; it will be reviewed after five years by
the Faculty Policy Committee, with a report presented to the Faculty.”
2009 – 2013
MIT Libraries assemble a collection within DSpace@MIT for Open Access
Articles.
~15,000 articles, ~1 million downloads
Background
~15,000 articles, ~1 million downloads, but…
Author-level information?
Department-level information?
Background
March 18, 2009 - Open Access Policy adopted
“[P]olicy … will be reviewed after five years…”
August 2013 - Project begins
“Implement author-level, article-level, and aggregated article download usage
statistics for articles in the Open Access Articles Collection in DSpace@MIT to
incentivize deposits and provide useful assessment information for the MIT
Faculty Open Access Policy.”
Prior Work
MyDASH provided a solid model…
• Map
• Timeline
• Summary table
… but couldn’t be directly implemented.
• Repository versus One Collection
• Multiple department affiliations
Project Goals
• Make available download statistics at three levels:
author, article, and aggregate
• Incentivize deposits to collection
• Provide useful information for policy evaluation
• Evaluate new technologies within the Libraries (i.e.
MongoDB)
Pipeline
Two-part project
Data processing pipeline
https://github.com/MITLibraries/oastats-backend
Visualization interface
https://github.com/MITLibraries/oastats-ui
Pipeline
https://github.com/MITLibraries/oastats-backend
• Apache logs
• Python
• DSpace
• GeoIP
• SOLR
Pipeline
Start from Apache server logs
● Filter the qualifying downloads
● Look up the downloaded paper
● Augment with additional information
● Store in MongoDB
● Use SOLR to build summary collection
UI queries summary collection
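The first pipeline steps (parse a log line, keep only qualifying downloads, build a record) might look roughly like the sketch below. This is an illustration, not the oastats-backend code: it assumes Apache's combined log format, and the names LOG_PATTERN, DOWNLOAD_PREFIX, and parse_download are hypothetical. The request path and the record fields mirror the sample record shown later in this deck. A real filter would likely also exclude crawlers, which this sketch does not attempt.

```python
import re
from datetime import datetime

# Assumption: logs use Apache's "combined" format; the deck does not say.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<request>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

# Path prefix taken from the "request" field of the deck's sample record.
DOWNLOAD_PREFIX = "/openaccess-disseminate/"


def parse_download(line):
    """Return a request record for a qualifying download, else None."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None  # not a well-formed log line
    if match["method"] != "GET" or match["status"] != "200":
        return None  # only successful GET requests count as downloads
    path = match["request"].split("?", 1)[0]
    if not path.startswith(DOWNLOAD_PREFIX):
        return None  # not a request for an Open Access article
    return {
        "status": match["status"],
        "handle": "http://hdl.handle.net/" + path[len(DOWNLOAD_PREFIX):],
        "request": match["request"],
        "referer": match["referer"],
        "user_agent": match["user_agent"],
        "ip_address": match["ip"],
        "time": datetime.strptime(match["time"], "%d/%b/%Y:%H:%M:%S %z"),
    }
```

Later steps (paper lookup, GeoIP augmentation, storage) would add fields such as title, authors, and country to this record.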
Pipeline challenges
Pipeline challenges - authors
Author identities
● Field-specific naming conventions
o “Abelson, Hal”
o “Abelson, H”
o “Hal Abelson”
● Common names
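To see why string matching alone falls short, consider a naive normalizer that reduces each variant to a (surname, first initial) key. This is a hypothetical sketch (name_key is not from the project). It does collapse the three variants above into one key, but it would equally merge two different people who share a surname and initial, which is the "common names" problem.

```python
def name_key(name):
    """Reduce a name variant to a (surname, first initial) key."""
    name = name.strip()
    if "," in name:
        # "Surname, Given" convention
        surname, given = (part.strip() for part in name.split(",", 1))
    else:
        # "Given Surname" convention; assume the last token is the surname
        parts = name.split()
        if not parts:
            return ("", "")
        surname, given = parts[-1], " ".join(parts[:-1])
    return (surname.lower(), given[:1].lower())


variants = ["Abelson, Hal", "Abelson, H", "Hal Abelson"]
keys = {name_key(v) for v in variants}  # all three variants share one key
```

Because such keys are ambiguous, attaching a stable per-person identifier to each author record is the more reliable approach.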
Pipeline challenges - authors
[
  {
    "mitid": "3.1415926537",
    "name": "Cohen-Tanugi, David"
  },
  {
    "mitid": "2.7182818",
    "name": "Dave, Shreya H."
  },
  {
    "mitid": "6.02x10^23",
    "name": "Grossman, Jeffrey C."
  },
  {
    "mitid": "1123581322",
    "name": "Lienhard, John H."
  },
  {
    "mitid": "1234567890",
    "name": "McGovern, Ronan Killian"
  }
]
Pipeline challenges - departments
Department names
● Inconsistent program / department affiliations
o “Media Laboratory”
o “Center for Bits and Atoms”
● Spelling Variations
o “MIT Department of Physics”
o “Massachusetts Institute of Technology, Department of Physics”
o “Dept. of Physics”
o “Physics”
Pipeline challenges - departments
Standardized department names
Whitelist of recognized names
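A whitelist lookup of this kind could be sketched as below. The names WHITELIST and standardize_department are hypothetical. The "display" and "canonical" field names follow the sample record in this deck, and the canonical string for Physics is assumed by analogy with that record rather than taken from the project's actual list.

```python
# Hypothetical whitelist entry; the canonical Physics string is an assumption.
PHYSICS = {
    "display": "Physics",
    "canonical": "Massachusetts Institute of Technology. Department of Physics",
}

# Every recognized spelling maps to one standardized entry.
WHITELIST = {
    "mit department of physics": PHYSICS,
    "massachusetts institute of technology, department of physics": PHYSICS,
    "dept. of physics": PHYSICS,
    "physics": PHYSICS,
}


def standardize_department(raw_name):
    """Return the standardized entry, or None if the name is unrecognized."""
    # Lowercase and collapse whitespace so trivial differences do not matter.
    key = " ".join(raw_name.lower().split())
    return WHITELIST.get(key)
```

Returning None for unrecognized names lets them be collected for manual review instead of being guessed at.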
{
  "_id" : ObjectId("5449127895b0c25083f29352"),
  "status" : "200",
  "handle" : "http://hdl.handle.net/1721.1/52491",
  "title" : "A basal ganglia-forebrain circuit in the songbird biases motor output to avoid vocal errors",
  "country" : "USA",
  "authors" : [
    {
      "mitid" : "3.1415926537",
      "name" : "Fee, Michale S."
    },
    {
      "mitid" : "6.02x10^23",
      "name" : "Andalman, Aaron S."
    }
  ],
  "request" : "/openaccess-disseminate/1721.1/52491",
  "referer" : "http://www.google.com/search?q=head+mounted+microphone+zebra+finch&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a",
  "user_agent" : "Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.9.2.8) Gecko/20100722 Firefox/3.6.8",
  "time" : ISODate("2010-08-10T17:14:03Z"),
  "ip_address" : "128.218.64.242",
  "dlcs" : [
    {
      "display" : "McGovern Institute for Brain Research at MIT",
      "canonical" : "McGovern Institute for Brain Research at MIT"
    },
    {
      "display" : "Brain and Cognitive Sciences",
      "canonical" : "Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences
{
  "_id" : "Overall",
  "countries" : [
    {
      "country" : "862",
      "downloads" : 35
    } …
  ],
  "dates" : [
    {
      "date" : "2014-01-07",
      "downloads" : 3
    } …
  ],
  "downloads" : 10000,
  "size" : 101,
  "type" : "overall"
}
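A summary document of this shape can be derived from the per-request records by counting downloads per country and per day. The sketch below does this in plain Python with collections.Counter rather than with Solr, which is what the pipeline reportedly uses, so treat it only as an illustration of the aggregation. The function name summarize is hypothetical, and the "size" field is omitted because the deck does not say how it is computed.

```python
from collections import Counter


def summarize(requests, summary_id="Overall", summary_type="overall"):
    """Aggregate request records into one summary document."""
    countries = Counter()
    dates = Counter()
    for request in requests:
        countries[request["country"]] += 1
        # Bucket downloads by calendar day, e.g. "2014-01-07".
        dates[request["time"].strftime("%Y-%m-%d")] += 1
    return {
        "_id": summary_id,
        "countries": [
            {"country": code, "downloads": count}
            for code, count in sorted(countries.items())
        ],
        "dates": [
            {"date": day, "downloads": count}
            for day, count in sorted(dates.items())
        ],
        "downloads": sum(countries.values()),
        "type": summary_type,
    }
```

Calling the same function on the subset of records for one author or one department would give the per-author and per-department summaries, with a different summary_id and summary_type.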
Web interface
https://github.com/MITLibraries/oastats-ui
● Mongo-backed
● PHP
● DataTables
● D3.js
● DataMaps
Email to authors
Dear {name},
Thank you for sharing your scholarly articles through the open repository DSpace@MIT <https://dspace.mit.edu/handle/1721.1/49433/>, in association with the MIT Faculty Open
Access Policy <https://libraries.mit.edu/oapolicy>.
Our newly implemented OA Stats Service provides data about the use and reach of our open access collection. Since August 2010, 15,184 articles have been downloaded from
227 different countries.
This service also provides information at the author and article level:
Your 3 articles have been downloaded 168 times since they were deposited, from 28 different countries.
You can access more detailed download information about your articles, including per-article and per-country downloads at <https://oastats.mit.edu>.
Initially, we plan to provide this information to all authors via email in the Fall and Spring semesters. As we seek to improve the service, we'll consider expanding options to
interact with it and the underlying data.
We are anxious to hear your feedback on how this service can be most useful to you, so please send your suggestions to oastats@mit.edu.
--From the MIT Libraries
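The per-author sentence in this email could be filled in from an author's statistics with ordinary string formatting. In this minimal sketch only the {name} placeholder comes from the slide. The other placeholders, the SUMMARY_LINES template, and render_summary are hypothetical.

```python
# Only {name} appears in the slide; the other placeholders are assumed.
SUMMARY_LINES = (
    "Dear {name},\n\n"
    "Your {articles} articles have been downloaded {downloads} times "
    "since they were deposited, from {countries} different countries."
)


def render_summary(author):
    """Fill the template from one author's download statistics."""
    return SUMMARY_LINES.format(
        name=author["name"],
        articles=author["articles"],
        downloads=author["downloads"],
        countries=author["countries"],
    )
```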
Faculty reception
Excitement
● “Thank you for the update, this is a fantastic tool!!”
● “Thanks so much for doing this - it's really cool and awesome!”
Why not more?
● “Hi, I like your feedback. But I am puzzled that only one of my articles is in
your database.”
● Department heads using this as leverage to encourage further
contributions
Project goals revisited
• Make available download statistics at three levels:
author, article, and aggregate
• Incentivize deposits to collection
• Provide useful information for policy evaluation
• Evaluate new technologies within the Libraries (i.e.
MongoDB)
Future work
● Automate the pipeline
● Run pipeline more frequently
● Ditch Mongo for something relational
● Talk to faculty about making more detailed information
public
● Add functionality to UI (additional format exports, move
to SPA)
● Improve cataloging in DSpace@MIT with lookup
services
Thanks!
Matt Bernhardt
mjbernha@mit.edu
@morphosis7
https://github.com/MITLibraries/oastats-backend
https://github.com/MITLibraries/oastats-ui
http://oastats.mit.edu

Visualizing Open Access - 2015 Code4Lib Northeast

  • 1. Visualizing Open Access building a scalable infrastructure to showcase the reach of MIT research
  • 3. Background March 18, 2009 - Open Access Policy adopted “...The policy is to take effect immediately; it will be reviewed after five years by the Faculty Policy Committee, with a report presented to the Faculty.”
  • 4. Background March 18, 2009 - Open Access Policy adopted “...The policy is to take effect immediately; it will be reviewed after five years by the Faculty Policy Committee, with a report presented to the Faculty.” 2009 – 2013 MIT Libraries assemble a collection within Dspace@MIT for Open Access Articles. ~15,000 articles, ~ 1 million downloads
  • 5. Background ~15,000 articles, ~ 1 million downloads, but… Author-level information? Department-level information?
  • 6. Background March 18, 2009 - Open Access Policy adopted “[P]olicy … will be reviewed after five years…” August 2013 - Project begins “Implement author-level, article-level, and aggregated article download usage statistics for articles in the Open Access Articles Collection in DSpace@MIT to incentivize deposits and provide useful assessment information for the MIT Faculty Open Access Policy.”
  • 8. Prior Work MyDASH provided solid model… • Map • Timeline • Summary table
  • 9. Prior Work MyDASH provided solid model… • Map • Timeline • Summary table … but couldn’t be directly implemented. • Repository versus One Collection • Multiple department affiliations
  • 10. Project Goals • Make available download statistics at three levels: author, article, and aggregate • Incentivize deposits to collection • Provide useful information for policy evaluation • Evaluate new technologies within the Libraries (i.e. MongoDB)
  • 12. Two-part project Data processing pipeline https://github.com/MITLibraries/oastats-backend Visualization interface https://github.com/MITLibraries/oastats-ui
  • 14. Pipeline Start from Apache server logs ● Filter the qualifying downloads ● Look up the downloaded paper ● Augment with additional information ● Store in MongoDB ● Use SOLR to build summary collection UI queries summary collection
  • 17. Pipeline challenges - authors Author identities ● Field-specific naming conventions o “Abelson, Hal” o “Abelson, H” o “Hal Abelson” ● Common names
  • 19. [ { "mitid": “3.1415926537", "name": "Cohen-Tanugi, David" }, { "mitid": “2.7182818", "name": "Dave, Shreya H." }, { "mitid": “6.02x10^23", "name": "Grossman, Jeffrey C." }, { "mitid": “1123581322", "name": "Lienhard, John H." }, { "mitid": “1234567890", "name": "McGovern, Ronan Killian" } ] Pipeline challenges - authors
  • 20. Pipeline challenges - departments Department names ● Inconsistent program / department affiliations o “Media Laboratory” o “Center for Bits and Atoms” ● Spelling Variations o “MIT Department of Physics” o “Massachusetts Institute of Technology, Department of Physics” o “Dept. of Physics” o “Physics”
  • 21. Pipeline challenges - departments Standardized department names Whitelist of recognized names
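A whitelist lookup of this kind could be sketched as follows. The mapping is illustrative: the Physics spellings come from the previous slide, the `display` / `canonical` pair mirrors the shape of the `dlcs` entries in the stored records, and the canonical Physics string is an assumption made by analogy with the Brain and Cognitive Sciences example. `normalize_dlc` is a hypothetical name.

```python
# Assumed canonical form, by analogy with the "dlcs" entries in the records.
PHYSICS = {
    "display": "Physics",
    "canonical": "Massachusetts Institute of Technology. Department of Physics",
}

# Whitelist of recognized spellings, keyed by lower-cased name.
WHITELIST = {
    "mit department of physics": PHYSICS,
    "massachusetts institute of technology, department of physics": PHYSICS,
    "dept. of physics": PHYSICS,
    "physics": PHYSICS,
}


def normalize_dlc(raw):
    """Return the standardized department entry, or None if unrecognized."""
    key = " ".join(raw.split()).lower()
    return WHITELIST.get(key)
```

Returning `None` for unknown names keeps unrecognized affiliations out of the summaries until someone adds them to the whitelist.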
  • 22. { "_id" : ObjectId("5449127895b0c25083f29352"), "status" : "200", "handle" : "http://hdl.handle.net/1721.1/52491", "title" : "A basal ganglia-forebrain circuit in the songbird biases motor output to avoid vocal errors", "country" : "USA", "authors" : [ { "mitid" : "3.1415926537", "name" : "Fee, Michale S." }, { "mitid" : "6.02x10^23", "name" : "Andalman, Aaron S." } ], "request" : "/openaccess-disseminate/1721.1/52491", "referer" : "http://www.google.com/search?q=head+mounted+microphone+zebra+finch&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a", "user_agent" : "Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.9.2.8) Gecko/20100722 Firefox/3.6.8", "time" : ISODate("2010-08-10T17:14:03Z"), "ip_address" : "128.218.64.242", "dlcs" : [ { "display" : "McGovern Institute for Brain Research at MIT", "canonical" : "McGovern Institute for Brain Research at MIT" }, { "display" : "Brain and Cognitive Sciences", "canonical" : "Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences
  • 23. { "_id" : "Overall", "countries" : [ { "country" : "862", "downloads" : 35 } … ], "dates" : [ { "date" : "2014-01-07", "downloads" : 3 } … ], "downloads" : 10000, "size" : 101, "type" : "overall" }
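The relationship between the raw request records and a summary document of this shape can be sketched in plain Python. This is a stand-in for illustration: the slides say the actual summary collection is built with SOLR, and the meaning of `size` is not stated, so the sketch assumes it is the number of distinct articles. `summarize` and the sample records are hypothetical.

```python
from collections import Counter
from datetime import datetime


def summarize(requests):
    """Roll a list of raw request records up into one 'overall' summary."""
    countries = Counter(r["country"] for r in requests)
    dates = Counter(r["time"].date().isoformat() for r in requests)
    handles = {r["handle"] for r in requests}
    return {
        "_id": "Overall",
        "type": "overall",
        "downloads": len(requests),
        "size": len(handles),  # assumption: number of distinct articles
        "countries": [
            {"country": c, "downloads": n} for c, n in sorted(countries.items())
        ],
        "dates": [
            {"date": d, "downloads": n} for d, n in sorted(dates.items())
        ],
    }


sample = [
    {"handle": "http://hdl.handle.net/1721.1/52491", "country": "USA",
     "time": datetime(2014, 1, 7, 9, 30)},
    {"handle": "http://hdl.handle.net/1721.1/52491", "country": "FRA",
     "time": datetime(2014, 1, 7, 18, 5)},
]
overall = summarize(sample)
```

Precomputing documents like this is what lets the UI query a small summary collection instead of scanning every raw request.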
  • 32. Email to authors Dear {name}, Thank you for sharing your scholarly articles through the open repository DSpace@MIT <https://dspace.mit.edu/handle/1721.1/49433/>, in association with the MIT Faculty Open Access Policy <https://libraries.mit.edu/oapolicy>. Our newly implemented OA Stats Service provides data about the use and reach of our open access collection. Since August 2010, 15,184 articles have been downloaded from 227 different countries. This service also provides information at the author and article level: Your 3 articles have been downloaded 168 times since they were deposited, from 28 different countries. You can access more detailed download information about your articles, including per-article and per-country downloads at <https://oastats.mit.edu>. Initially, we plan to provide this information to all authors via email in the Fall and Spring semesters. As we seek to improve the service, we'll consider expanding options to interact with it and the underlying data. We are anxious to hear your feedback on how this service can be most useful to you, so please send your suggestions to oastats@mit.edu. --From the MIT Libraries
  • 35. Faculty reception Excitement ● “Thank you for the update, this is a fantastic tool!!” ● “Thanks so much for doing this - it's really cool and awesome!” Why not more? ● “Hi, I like your feedback. But I am puzzled that only one of my articles is in your database.” ● Department heads using this as leverage to encourage further contributions
  • 36. Project goals revisited • Make available download statistics at three levels: author, article, and aggregate • Incentivize deposits to collection • Provide useful information for policy evaluation • Evaluate new technologies within the Libraries (e.g., MongoDB)
  • 37. Future work ● Automate the pipeline ● Run pipeline more frequently ● Ditch Mongo for something relational ● Talk to faculty about making more detailed information public ● Add functionality to UI (additional format exports, move to SPA) ● Improve cataloging in DSpace@MIT with lookup services

Editor's Notes

  1. Timeline of open access at MIT: 2009 faculty vote; growth of the collection over time; rough figures on current size
  2–5. Five-year anniversary; policy review called for; Libraries wanted to contribute information to faculty to help inform the debate
  6–8. Harvard Libraries had unveiled MyDASH, which served as an inspiration for our early work
  9–12. Need to generate a pipeline diagram: start from Apache logs; filter for OA downloads; filter out bots; augment with author identities; augment with geo-referenced IP addresses; store in raw Mongo collection; generate summary collection via SOLR; visualize in UI. https://github.com/MITLibraries/oastats-backend
  13–20. Maybe repeated diagrams, adding sections as the pipeline got more complicated? Screenshots of OpenRefine, or of DSpace@MIT showing wrong data?
  21. The
  22–27. Need to generate a UI graphic
  28. The latest action has been to send an email to all authors represented in the collection, inviting them to view the information about the downloads of their papers.
  29. This is a sample email, providing basic information about how many papers the author has in the collection, and some summary statistics about their downloads.
  30. This messaging was successful in driving a lot of traffic to the platform.
  31. There are also some kinks to be worked out about what email addresses we use.
  32. The feedback we’ve received from faculty and administrators about this project has been almost entirely positive.
  33. Project goals were met
  34–35. There are also some kinks to be worked out about what email addresses we use.