Mendeley, putting data into the hands of researchers
 

I was invited to give a keynote presentation at the RecSysTEL Workshop (http://bit.ly/b2Bg2J) on 2010/09/30.

It presents Mendeley's tools for researchers and the data sets that we made available for the dataTEL challenge, which is designed to provide new large-scale data for researchers in recommendation systems.

The event was really enjoyable and the participants were excited about Mendeley.

    Presentation Transcript

    • Mendeley, putting data into the hands of researchers. Kris Jack, PhD, Data Mining Team Coordinator
    • “All the time we are very conscious of the huge challenges that human society has now – curing cancer, understanding the brain for Alzheimer’s [...]. But a lot of the state of knowledge of the human race is sitting in the scientists’ computers, and is currently not shared […] We need to get it unlocked so we can tackle those huge problems.”
    • Summary ➔ idea behind Mendeley ➔ our features ➔ our technical challenges and solutions ➔ what does this mean for you?
    • Mendeley / Last.fm. Last.fm works like this: 1) Install “Audioscrobbler” 2) Listen to music 3) Last.fm builds your music profile and recommends music you might also like... and it’s the world’s biggest open music database
    • Last.fm ➔ Mendeley: music libraries ➔ research libraries; artists ➔ researchers; songs ➔ papers; genres ➔ disciplines
    • Summary ➔ idea behind Mendeley ➔ our features ➔ our technical challenges and solutions ➔ what does this mean for you?
    • Mendeley helps researchers work smarter
    • Mendeley helps researchers work smarter: install Mendeley Desktop; Mendeley extracts research data..
    • Mendeley helps researchers work smarter: Mendeley extracts research data.. ..and aggregates research data in the cloud
    • By doing this, Mendeley makes science more collaborative and transparent
    • Summary ➔ idea behind Mendeley ➔ our features ➔ our technical challenges and solutions ➔ what does this mean for you?
    • 500,000+ users; 39,000,000+ articles; the 20 largest userbases: University of Cambridge, Stanford University, MIT, University of Michigan, Harvard University, University of Oxford, Sao Paulo University, Imperial College London, University of Edinburgh, Cornell University, University of California at Berkeley, RWTH Aachen, Columbia University, Georgia Tech, University of Wisconsin, UC San Diego, University of California at LA, University of Florida, University of North Carolina
    • we can only use algorithms that scale up: readership statistics, search, most frequent tags, related research + dozens of other services
    • most frequent tags on our scale: readership statistics, search, most frequent tags, related research
    • most frequent tags on our scale: for each document (called 39,000,000 times): for each tag in document (called ~3 times): increment count for tag (called ~39,000,000 x 3 = ~117,000,000 times); sort tags by frequency
    • solution: distributed computing (map reduce): for each document: for each tag in document: increment count for tag; sort tags by frequency; for each tag counted: emit the tag and frequency. Jeffrey Dean and Sanjay Ghemawat. (2004) MapReduce: Simplified Data Processing on Large Clusters. In Proceedings of OSDI 2004, San Francisco, CA.
    • solution: distributed computing (hadoop). Jeffrey Dean and Sanjay Ghemawat. (2004) MapReduce: Simplified Data Processing on Large Clusters. In Proceedings of OSDI 2004, San Francisco, CA.
    • support vector machines, hidden markov models
    • conditional random fields. Isaac G. Councill, C. Lee Giles, Min-Yen Kan. (2008) ParsCit: An open-source CRF reference string parsing package. In Proceedings of the LREC 08, Marrakesh, Morocco.
    • deduplication: crowd sourcing new articles from users; collapse metadata and update canonical docs; file hash check; metadata comparison; document fingerprinting; 39,000,000 canonical documents
    • statistics pig
    • readerrank
    • currently tf-idf similarity between documents; developing collaborative filtering
    • contact recommendations: currently recommendations based on contact network; developing version based on interests
    • Summary ➔ idea behind Mendeley ➔ our features ➔ our technical challenges and solutions ➔ what does this mean for you?
    • access to data
    • online catalog; dataTEL data set; online article view logs; article tags; library readership; library stars
    • Mendeley's API
    • *new* you can get all of the articles in a group - data for you to test related research algos?
    • Mendeley's API: mashups with data on: chemical compounds, locations, Alzheimer’s research, grant funding, Twitter streams
    • want more? let us know...
    • “All the time we are very conscious of the huge challenges that human society has now – curing cancer, understanding the brain for Alzheimer’s [...]. But a lot of the state of knowledge of the human race is sitting in the scientists’ computers, and is currently not shared […] We need to get it unlocked so we can tackle those huge problems.”
    • www.mendeley.com we're hiring!
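
The "most frequent tags" slides above describe counting tags per document and then moving that loop to MapReduce. The following is a minimal single-process sketch of that idea in Python, not Mendeley's actual Hadoop job: the toy `documents` list and the function names `map_tags`, `shuffle`, `reduce_tag`, and `most_frequent_tags` are all hypothetical, and the `shuffle` step only imitates the grouping that a MapReduce framework would perform between the two phases.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical toy corpus: each document is represented only by its tags.
documents = [
    ["machine learning", "recommender systems"],
    ["recommender systems", "e-learning"],
    ["recommender systems", "machine learning", "tel"],
]


def map_tags(document_tags):
    """Map step: emit a (tag, 1) pair for every tag in one document."""
    for tag in document_tags:
        yield tag, 1


def shuffle(pairs):
    """Group emitted values by key, imitating the framework's shuffle phase."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_tag(tag, counts):
    """Reduce step: sum the counts emitted for one tag."""
    return tag, sum(counts)


def most_frequent_tags(docs):
    """Run map, shuffle, and reduce in-process, then sort tags by frequency."""
    mapped = chain.from_iterable(map_tags(doc) for doc in docs)
    reduced = [reduce_tag(tag, counts) for tag, counts in shuffle(mapped).items()]
    # Descending frequency, then alphabetical so ties have a stable order.
    return sorted(reduced, key=lambda item: (-item[1], item[0]))


result = most_frequent_tags(documents)
print(result)
```

Because each map call sees one document and each reduce call sees one tag, both steps can in principle be spread across machines, which is the property the slides rely on.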
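
The related-research slide above mentions tf-idf similarity between documents. As a rough illustration only, the sketch below computes tf-idf vectors and cosine similarity in plain Python. The sample `docs`, the whitespace tokenizer, and the particular weighting (term frequency times `log(N / df)`) are assumptions chosen for brevity; the slides do not say which variant or which document fields Mendeley uses.

```python
import math
from collections import Counter

# Hypothetical toy documents; real input would be article text or metadata.
docs = {
    "a": "collaborative filtering for recommender systems",
    "b": "recommender systems for research papers",
    "c": "protein folding in yeast",
}


def tf_idf_vectors(texts):
    """Build a sparse tf-idf vector (term -> weight) for each document."""
    tokenized = {name: text.lower().split() for name, text in texts.items()}
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(
        term for tokens in tokenized.values() for term in set(tokens)
    )
    vectors = {}
    for name, tokens in tokenized.items():
        counts = Counter(tokens)
        vectors[name] = {
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


vectors = tf_idf_vectors(docs)
sim_ab = cosine(vectors["a"], vectors["b"])  # share "recommender systems for"
sim_ac = cosine(vectors["a"], vectors["c"])  # share no terms
print(sim_ab > sim_ac)
```

Documents "a" and "b" share terms and so score above zero, while "a" and "c" share none; ranking candidates by this score is one simple way to surface related research.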