AWS Customer Presentation - Zanran and AWS

Jon Goldhill from Zanran talks about running a search engine on AWS

Published in: Technology, Business

Transcript

  • 1. Zanran
       - a tale
  • 2. people
       Jon Goldhill
       Yves Dassas
       MBA from London Business School
       Started telecom information business in 1987
       PhD in electrochemistry
       Started voice processing business in 1995
       Built international telephony business together
  • 3.
  • 4.
  • 5.
  • 6. ...While in early beta, this is a pretty exciting place for a data junkie.
       www.clearhonestdata.com 12 May 2011
       Is Zanran Any Good?
       Short answer: For some queries, yes, Zanran is quite good. Almost scarily so, actually.
       http://SearchEngineLand.com 12 May 2011
       I don't usually post on non-patent or other IP matters, but I'm making an exception for a valuable search engine that should be used when it's data rather than words that you are looking for....
       Steve van Dulken’s blog on Patents and IP, 13 Aug 2011
  • 7. How did we get here?
  • 8. Image classification (filtering)
  • 9. Difficult!
  • 10. Started fundraising, January 2008
       Gave up on fundraising, May 2008
       But... introduction to AWS user
  • 11. in office
       in datacentre
       Amazon cloud
       - Easy to maintain
       - Very limited in scale
       - Familiar
       - Expensive – machines and space
       - Committing
       - Cheap to experiment
       - Scalable
       - Avoid purchase errors
       but: unfamiliar, Linux
  • 12. Zanran front end – what the users interact with
       users
       Webserver
       Solr
       EC2: High-Memory Extra Large Instance
       Storage on S3
  • 13. Zanran back end – batch processing
       Crawl the internet
       Stage 1
       Stage 2
       Stage 3
       Stage 4
       new PDF, Excel, etc
       Amazon RDS
       Image processing: Is this a graph?
       Text extraction: Find a title + other useful text
       index
       Solr
       2
  • 14. scale
       Crawling: 10 small instances
       Image processing: 300 small instances
       Text extraction: 20 small instances
       Re-indexing: 1 extra large instance
  • 15. reliability
       Solr: 6 months
       RDS: 19 months
       S3: 100m+ files stored
  • 16. Benefits from using Amazon
       scalability – from 3 to 303 servers
       scalability – from 7 to 17GB RAM
       flexibility – Solr development servers
       ‘ecosystem’ – RightScale, forums
       lower capital and operations costs
  • 17. office dog
       not present today
