Amazon Web Services at Mendeley
Dan Harvey, Data Architect
twitter: @danharvey
dan.harvey@mendeley.com

Amazon Web Services at Mendeley


An overview of how we use Amazon Web Services at Mendeley. I go into more detail on generating PDF previews for the 13 TB of files on our site, and on scaling Solr search to handle variable load on EC2.

Published in: Technology

    1. Amazon Web Services at Mendeley
        Dan Harvey, Data Architect
        twitter: @danharvey
        dan.harvey@mendeley.com
    2. Overview
        • What do we do?
        • System design
        • AWS details
        • Future plans
        • Summary
    3. Mendeley helps researchers work smarter
    4. Mendeley helps researchers work smarter
        1) Install Mendeley Desktop
    5. Mendeley helps researchers work smarter
        1) Install Mendeley Desktop
        2) Manage your research papers: automatic data extraction
    6. Mendeley helps researchers work smarter
        1) Install Mendeley Desktop
        2) Manage your research papers: external database integration
    7. Mendeley helps researchers work smarter
        1) Install Mendeley Desktop
        2) Manage your research papers: automatic bibliography generation
    8. Mendeley helps researchers work smarter
        1) Install Mendeley Desktop
        2) Manage your research papers: tagging and annotation
    9. Mendeley helps researchers work smarter
        1) Install Mendeley Desktop
        2) Manage your research papers
        3) Mendeley aggregates research data in the cloud
    10. By doing this, Mendeley makes science more collaborative and transparent
    11. Mendeley in numbers
        • 1 million users
        • 130 million research articles
        • 40 million unique
        • 14 million unique files uploaded
        • 13 TB in total
    12. System Overview
        [Diagram labels: S3, EMR, EC2 (Amazon Web Services); Web Server (x2); Syncing; Browsing; Docs; Usage Logs; MySQL (x3); Data Services; Map Reduce; HBase; HDFS]
    13. File Storage
        • Sync to and from clients
          – Backed onto S3
        • How to render 13 TB of PDFs?
    14. PDF Previews
        • Elastic Beanstalk
        • Java servlet
          – Load & render
          – Store into S3
        • Quick to prototype
          – Fast iterations
          – No infrastructure to set up
          – Developers in control
          – No upfront cost in hardware
        • No dependency on rest of our infrastructure
        (Image © Elastic Beanstalk, Matt Wood, AWS, 2011)
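The load, render and store steps on the PDF Previews slide might look roughly like this. It is a minimal Python sketch, not the Java servlet from the talk: `InMemoryStore`, `render_preview`, `stub_renderer` and the `previews/<sha1>.png` key layout are all hypothetical, a dictionary stands in for S3, and the renderer is a stub where a real PDF rasteriser would go.

```python
import hashlib


class InMemoryStore:
    """Stand-in for an S3 bucket (assumption): maps object keys to bytes."""

    def __init__(self):
        self.objects = {}

    def get(self, key):
        return self.objects[key]

    def put(self, key, data):
        self.objects[key] = data


def preview_key(pdf_bytes):
    # Hypothetical key layout: a content hash, so identical files share a preview.
    return "previews/" + hashlib.sha1(pdf_bytes).hexdigest() + ".png"


def stub_renderer(pdf_bytes):
    # Placeholder for a real PDF rasteriser; returns fake image bytes.
    return b"PNG:" + pdf_bytes[:8]


def render_preview(store, pdf_key, renderer=stub_renderer):
    pdf_bytes = store.get(pdf_key)            # load
    key = preview_key(pdf_bytes)
    if key not in store.objects:              # skip previews already rendered
        store.put(key, renderer(pdf_bytes))   # render and store
    return key
```

Keeping the store and renderer as parameters mirrors the slide's point that the preview service has no dependency on the rest of the infrastructure.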
    15. Adapt to take advantage
        • Improve delivery
          – CloudFront
          – Faster worldwide
        • Re-working for cost saving
          – SQS
          – Spot instances
          – Render when it’s cheapest!
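"Render when it's cheapest" can be read as a simple scheduling rule over a queue backlog and a spot price. The sketch below is one possible interpretation, not the approach described in the talk: `should_render`, `cheapest_window` and all thresholds are hypothetical, and prices are plain numbers supplied by the caller rather than fetched from AWS.

```python
def should_render(spot_price, max_price, backlog, backlog_limit):
    """Decide whether to start rendering workers now.

    Render when the spot price is at or below our ceiling, or when the
    queue has grown so long that waiting for a better price costs too much.
    """
    if backlog == 0:
        return False
    if spot_price <= max_price:
        return True
    return backlog >= backlog_limit


def cheapest_window(prices, hours_needed):
    """Return the start index of the cheapest contiguous run of hours."""
    if hours_needed <= 0 or hours_needed > len(prices):
        raise ValueError("hours_needed must be between 1 and len(prices)")
    window = sum(prices[:hours_needed])
    best_cost, best_start = window, 0
    for start in range(1, len(prices) - hours_needed + 1):
        # Slide the window: add the hour entering, drop the hour leaving.
        window += prices[start + hours_needed - 1] - prices[start - 1]
        if window < best_cost:
            best_cost, best_start = window, start
    return best_start
```

Because the rendering jobs sit in a queue, nothing is lost by waiting: the backlog limit only bounds how stale previews are allowed to get.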
    16. Article Search
        • 40 million papers
        • Gives 40GB index in Solr
        • Variable load
        • Moved to EC2
          – Elastic Load Balancer
          – Auto-scale instances
        [Chart: Two fold variance in traffic over a week]
    17. Solr Instance Layout
        • Master
          – Single instance
          – Matched to indexing load
          – Backed onto EBS
        • Slaves
          – HTTP sync to master
          – Pre-built AMI images
          – EC2 auto scaling
        [Diagram: one Solr Master feeding three Solr Slaves behind an Elastic Load Balancer]
    18. Desktop Client
        • Client Downloads
          – From S3
          – Adding CloudFront
        • Crash Reports
          – Stack traces into S3
          – Analytic reports on top
          – More focused bug fixing
    19. The future
        • Aim to buy no more hardware
        • More Java on Elastic Beanstalk
        • SQS - replace queues
        • EMR - log analysis
        • SimpleDB & S3 for data stores
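One way "analytic reports on top" of stored stack traces could work is to group traces by their leading frames and rank the groups by frequency. This is a hedged Python sketch of that idea, not Mendeley's reporting code: `crash_signature`, `top_crashes` and the choice of three frames are assumptions, and the traces are passed in as strings rather than read from S3.

```python
from collections import Counter


def crash_signature(stack_trace, frames=3):
    """Reduce a stack trace to its first few non-empty lines."""
    lines = [line.strip() for line in stack_trace.splitlines() if line.strip()]
    return " | ".join(lines[:frames])


def top_crashes(stack_traces, limit=5):
    """Return the most frequent crash signatures with their counts."""
    counts = Counter(crash_signature(trace) for trace in stack_traces)
    return counts.most_common(limit)
```

Ranking by signature is what makes bug fixing "more focused": the crash seen most often rises to the top of the list.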
    20. Problems Faced
        • Accounting usage
          – Mix of users on account
          – Start early with this!
          – IAM helps
        • Orchestration
          – CloudFormation
          – Elastic Beanstalk
          – Finding we need more
    21. Summary
        • Not all or nothing
        • Focus on your problem, not “undifferentiated heavy lifting” - Werner Vogels
        • Learn the building blocks provided
        • Modular system design helps
    22. Mendeley Binary Battle
        • $10,001 prize + $1000 AWS vouchers
        • Collaboration with PLoS
        • Prizes to best use of the API
        • Judging panel includes
          – Werner Vogels
          – Tim O'Reilly
    23. We’re hiring
        http://mendeley.com/careers/ or chat to me after
        • Lead Mobile Developer, iOS
        • Web Developer, PHP/MySQL
        • Software Engineer, Java