Amazon Web Services at Mendeley

An overview of how we use Amazon Web Services at Mendeley. I go into more detail on generating PDF previews for the 13 TB of files on our site, and on scaling Solr search to handle variable load on EC2.

Published in: Technology

  1. 1. Amazon Web Services at Mendeley
      Dan Harvey, Data Architect
      twitter: @danharvey
      dan.harvey@mendeley.com
  2. 2. Overview
      • What do we do?
      • System design
      • AWS details
      • Future plans
      • Summary
  3. 3. Mendeley helps researchers work smarter
  4. 4. Mendeley helps researchers work smarter
      1) Install Mendeley Desktop
  5. 5. Mendeley helps researchers work smarter
      1) Install Mendeley Desktop
      2) Manage your research papers
        – Automatic data extraction
  6. 6. Mendeley helps researchers work smarter
      1) Install Mendeley Desktop
      2) Manage your research papers
        – External database integration
  7. 7. Mendeley helps researchers work smarter
      1) Install Mendeley Desktop
      2) Manage your research papers
        – Automatic bibliography generation
  8. 8. Mendeley helps researchers work smarter
      1) Install Mendeley Desktop
      2) Manage your research papers
        – Tagging and annotation
  9. 9. Mendeley helps researchers work smarter
      1) Install Mendeley Desktop
      2) Manage your research papers
      3) Mendeley aggregates research data in the cloud
  10. 10. By doing this, Mendeley makes science more collaborative and transparent
  11. 11. Mendeley in numbers
      • 1 million users
      • 130 million research articles
      • 40 million unique
      • 14 million unique files uploaded
      • 13 TB in total
  12. 12. System Overview
      [Architecture diagram. Labels: Amazon Web Services, S3, EMR, EC2, Web Server (x2), Syncing, Browsing, Docs, Usage Logs, MySQL (x3), Data Services, Map Reduce, HBase, HDFS]
  13. 13. File Storage
      • Sync to and from clients
        – Backed onto S3
      • How to render 13 TB of PDFs?
  14. 14. PDF Previews
      • Elastic Beanstalk
      • Java servlet
        – Load & render
        – Store into S3
      • Quick to prototype
        – Fast iterations
        – No infrastructure to set up
        – Developers in control
        – No upfront cost in hardware
      • No dependency on the rest of our infrastructure
      [Image credit: © Elastic Beanstalk, Matt Wood, AWS, 2011]
  15. 15. Adapt to take advantage
      • Improve delivery
        – CloudFront
        – Faster worldwide
      • Re-working for cost saving
        – SQS
        – Spot instances
        – Render when it’s cheapest!
  16. 16. Article Search
      • 40 million papers
      • Gives 40 GB index in Solr
      • Variable load
      • Moved to EC2
        – Elastic Load Balancer
        – Auto-scale instances
      [Chart: two-fold variance in traffic over a week]
  17. 17. Solr Instance Layout
      • Master Solr
        – Single instance
        – Matched to indexing load
        – Backed onto EBS
      • Slaves
        – HTTP sync to master
        – Pre-built AMI images
        – EC2 auto scaling
      [Diagram: Master, three Solr Slaves, Elastic Load Balancer]
  18. 18. Desktop Client
      • Client Downloads
        – From S3
        – Adding CloudFront
      • Crash Reports
        – Stack traces into S3
        – Analytic reports on top
        – More focused bug fixing
  19. 19. The future
      • Aim to buy no more hardware
      • More Java on Elastic Beanstalk
      • SQS - replace queues
      • EMR - log analysis
      • SimpleDB & S3 for data stores
  20. 20. Problems Faced
      • Accounting usage
        – Mix of users on account
        – Start early with this!
        – IAM helps
      • Orchestration
        – CloudFormation
        – Elastic Beanstalk
        – Finding we need more
  21. 21. Summary
      • Not all or nothing
      • Focus on your problem, not “undifferentiated heavy lifting” (Werner Vogels)
      • Learn the building blocks provided
      • Modular system design helps
  22. 22. Mendeley Binary Battle
      • $10,001 prize + $1,000 AWS vouchers
      • Collaboration with PLoS
      • Prizes for the best use of the API
      • Judging panel includes
        – Werner Vogels
        – Tim O’Reilly
  23. 23. We’re hiring
      http://mendeley.com/careers/ or chat to me after
      • Lead Mobile Developer, iOS
      • Web Developer, PHP/MySQL
      • Software Engineer, Java
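
Slide 15's plan to queue render jobs (SQS) and "render when it's cheapest" (spot instances) can be illustrated with a small sketch. This is not Mendeley's implementation: the class name `RenderScheduler`, the `max_price` ceiling, and the in-memory queue standing in for SQS are all assumptions made for illustration, and no AWS API is called. The idea shown is only that jobs wait in a queue and are released in batches when the current price is at or below a chosen ceiling.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class RenderScheduler:
    """Holds pending PDF render jobs and releases them only when compute is cheap.

    Hypothetical sketch: an in-memory deque stands in for a real queue
    service, and "rendering" just records the file id.
    """

    max_price: float  # assumed price ceiling, in dollars per instance-hour
    pending: deque = field(default_factory=deque)
    rendered: list = field(default_factory=list)

    def enqueue(self, file_id: str) -> None:
        """Add a file to the back of the render queue."""
        self.pending.append(file_id)

    def tick(self, spot_price: float, batch_size: int) -> int:
        """Render up to batch_size queued files if the price is acceptable.

        Returns the number of files rendered during this tick.
        """
        if spot_price > self.max_price:
            return 0  # too expensive right now; leave the jobs queued
        done = 0
        while self.pending and done < batch_size:
            self.rendered.append(self.pending.popleft())
            done += 1
        return done


if __name__ == "__main__":
    scheduler = RenderScheduler(max_price=0.05)
    for name in ["a.pdf", "b.pdf", "c.pdf"]:
        scheduler.enqueue(name)
    print(scheduler.tick(spot_price=0.08, batch_size=2))  # price above ceiling
    print(scheduler.tick(spot_price=0.03, batch_size=2))  # price below ceiling
    print(list(scheduler.pending))
```

Because previews are not needed the moment a file is uploaded, deferring work this way trades latency for cost. A real deployment would also need retry handling for instances that are reclaimed mid-render, which the sketch leaves out.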
