Mendeley's Data and
Perspectives on Data
          Challenges




          Kris Jack, PhD
         Chief Data Scientist
   https://twitter.com/_krisjack
Overview

➔
    What's Mendeley?

➔
    Why Run Challenges?

➔
    Mendeley's Challenges

➔
    Conclusions
What's Mendeley?
➔
    Mendeley is a platform that connects
    researchers, research data and apps




                         Mendeley Open API


➔
    How are we building our community?
Mendeley provides tools to help users...


...organise
their research



                                              ➔
                                                  Reference
                                                  management

                                              ➔
                                                  Cite-as-you-write

                                              ➔
                                                  Full-text
                                                  article search

                                              ➔
                                                  Digitalised
                                                  annotations
...collaborate with
one another




                                         ➔
                                             Professional
                                             research groups

                                         ➔
                                             Social network

                                         ➔
                                             Annotation
                                             sharing
...discover new
research




                                       ➔
                                           Personalised article
                                           recommendations

                                       ➔
                                           Related research

                                       ➔
                                           Research contact
                                           suggestions
Our community from a data perspective




➔
    Social network (~2M users)
➔
    Personal libraries (~300M articles)
➔
    Research groups (~175K groups)
➔
    Research catalogue (~50M unique articles)
Why Run
Challenges?
Why Run Challenges?
➔
    An important part of our mission is to make science more open

                                “All the time we are very
                                conscious of the huge challenges
                                that human society has now –
                                curing cancer, understanding
                                the brain for Alzheimer’s [...].

                                But a lot of the state of knowledge
                                of the human race is sitting in the
                                scientists’ computers, and is
                                currently not shared [...] We need
                                to get it unlocked so we can tackle
                                those huge problems.”
➔
    We run challenges that aim to open up science
➔
    Your skills in information sciences are valuable to us
Mendeley's
Challenges
PLoS/Mendeley's Binary Battle

Challenge: Build an application with our data,
           make science more open.


Results:




             More details at http://dev.mendeley.com/api-binary-battle/
ScienceRec Challenge 2012

Challenge: Build an off-line system for scientific
           recommendations with our API
           and the DataTEL data set


Results:     Will discuss today
             How to improve for the future?




50K users, with at least
   20 articles each

             More details at http://2012.recsyschallenge.com/tracks/sciencerec/
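
An off-line evaluation of the kind this challenge asks for could be sketched as follows. This is only an illustration, not the challenge's official protocol: the input layout (a dict mapping a user id to an ordered list of article ids), the hold-out split, the popularity baseline and the precision@k metric are all assumptions made for the example. Only the 20-article threshold comes from the slide.

```python
from collections import Counter

MIN_ARTICLES = 20  # threshold from the slide; everything else here is illustrative


def eligible_users(libraries, min_articles=MIN_ARTICLES):
    """Keep only users whose library holds at least min_articles articles."""
    return {u: arts for u, arts in libraries.items() if len(arts) >= min_articles}


def split_library(articles, n_test=5):
    """Hold out the last n_test articles of a library for testing (assumed protocol)."""
    return articles[:-n_test], articles[-n_test:]


def popularity_recommender(train, k):
    """Baseline: recommend the k unseen articles found in the most training libraries."""
    counts = Counter(a for arts in train.values() for a in set(arts))
    # Sort by descending count, then by article id so that ties are deterministic.
    ranked = [a for a, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]

    def recommend(user):
        seen = set(train[user])
        return [a for a in ranked if a not in seen][:k]

    return recommend


def precision_at_k(recommended, held_out, k):
    """Fraction of the top-k recommendations that appear in the held-out set."""
    relevant = set(held_out)
    return sum(1 for a in recommended[:k] if a in relevant) / k


def evaluate(libraries, k=5, n_test=5):
    """Mean precision@k of the popularity baseline over all eligible users."""
    users = eligible_users(libraries)
    train, test = {}, {}
    for user, articles in users.items():
        train[user], test[user] = split_library(articles, n_test)
    recommend = popularity_recommender(train, k)
    scores = [precision_at_k(recommend(u), test[u], k) for u in users]
    return sum(scores) / len(scores) if scores else 0.0
```

Swapping `popularity_recommender` for another algorithm while keeping `evaluate` fixed is one way to compare systems on the same split.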
Conclusions
Conclusions
➔
    Mendeley makes tools that help researchers to:
    ➔
        organise their research
    ➔
        collaborate with one another
    ➔
        discover new research
➔
    We are crowdsourcing a wealth of research data
➔
    We're opening it up to the world
➔
    And inviting you to participate
We're Hiring!
➔
    Data Scientist
    ➔
        apply recommender technologies to Mendeley's data
    ➔
        work on improving the quality of Mendeley's research catalogue
    ➔
        starting in the first quarter of 2013
    ➔
        6-month secondment at the KNOW Center, TU Graz, Austria, as part of the EC FP7
        TEAM project (http://team-project.tugraz.at/)
➔
    http://www.mendeley.com/careers/
www.mendeley.com
A Challenge for the Future?

Challenge: Investigate how well algorithms
           perform in real-world settings

Motivation: Industry repeatedly finds that
            aggressive A/B testing is required
            because offline improvements do
            not necessarily translate to online
            improvements

Problem:    Academia tends not to have access
            to large online communities

Solution:   Industry runs A/B test with
            academic algorithms and reports
            results

What about privacy?
  Use publicly available data
  Anonymise and aggregate results reported
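
As a rough sketch of what reporting anonymised, aggregated A/B results might look like: the event format `(user_id, arm, clicked)`, the click-through metric and the two-proportion z-test are assumptions chosen for illustration, not a procedure described in the slides. User ids are dropped at the aggregation step, so only per-arm counts are ever reported.

```python
import math
from collections import defaultdict


def aggregate(events):
    """Collapse (user_id, arm, clicked) events into per-arm counts.

    The user id is discarded here, so the output holds no per-user data.
    """
    totals = defaultdict(lambda: {"views": 0, "clicks": 0})
    for _user_id, arm, clicked in events:
        totals[arm]["views"] += 1
        totals[arm]["clicks"] += int(bool(clicked))
    return dict(totals)


def two_proportion_z_test(clicks_a, views_a, clicks_b, views_b):
    """Two-sided z-test for a difference in click-through rate between two arms."""
    p_a = clicks_a / views_a
    p_b = clicks_b / views_b
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    if se == 0:
        # Both arms have identical all-or-nothing rates: no measurable difference.
        return 0.0, 1.0
    z = (p_b - p_a) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def report(events, arm_a="baseline", arm_b="academic"):
    """Build the aggregate report an industry partner could share."""
    totals = aggregate(events)
    a, b = totals[arm_a], totals[arm_b]
    z, p_value = two_proportion_z_test(a["clicks"], a["views"], b["clicks"], b["views"])
    return {
        arm_a: a["clicks"] / a["views"],
        arm_b: b["clicks"] / b["views"],
        "z": z,
        "p_value": p_value,
    }
```

The arm names `baseline` and `academic` are placeholders. A real study would also need to fix the sample size in advance and account for repeated views by the same user, which this sketch ignores.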
