Using MongoDB for the Art Genome Project (Mongo Boston 2011)

Using MongoDB for the Art Genome Project, presented 10/3/2011 at Mongo Boston.

Published in: Technology

  1. Happiness is MongoDB
     Monday, October 3rd, 2011
     Daniel Doubrovkine (dB.), db@art.sy, @dblockdotorg, http://code.dblock.org
     902 Broadway, 4th Fl., New York, NY
  2. Claude Monet
     Mark Grotjahn
     Demo
  3. Art Genome Project
     Kasimir Malevich, “Self Portrait”
     - vs. -
     William Beckman, “Self Portrait”
  4. Euclidean Distance
     $d = \sqrt{(100-100)^2 + (100-50)^2 + \dots + (75-20)^2} = 42$
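The slide shows one numeric instance. In general form, assuming each artwork is described by a vector of n gene scores (the symbols a, b and n are notation introduced here, not taken from the slides), the distance can be written as:

```latex
d(a, b) = \sqrt{\sum_{i=1}^{n} (a_i - b_i)^2}
```

A smaller d means the two artworks score more alike across their genes.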
  5. Fast Search in Ruby
     def similar(a1)
       artworks.map { |a2|
         [a2, euclidean(a1, a2)]
       }.sort_by { |a, d|
         d
       }.take(10)
     end
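A minimal runnable sketch of the search on this slide, under stated assumptions: the slide does not show `euclidean` or how `artworks` is stored, so the array-based distance, the `ARTWORKS` hash, and the sample scores below are all hypothetical.

```ruby
# Euclidean distance between two equal-length arrays of gene scores.
# (Assumed implementation; the slide only names the method.)
def euclidean(a, b)
  Math.sqrt(a.zip(b).sum { |x, y| (x - y)**2 })
end

# Returns up to `limit` [name, distance] pairs, nearest first.
# `artworks` is a hypothetical Hash of name => score vector.
def similar(artworks, target, limit = 10)
  artworks
    .reject { |name, _| name == target }
    .map { |name, genes| [name, euclidean(artworks[target], genes)] }
    .sort_by { |_, distance| distance }
    .take(limit)
end

# Made-up sample data for illustration only.
ARTWORKS = {
  "a" => [100, 100, 75],
  "b" => [100, 50, 20],
  "c" => [100, 90, 70]
}

p similar(ARTWORKS, "a", 2)
```

With this sample data, "c" ranks ahead of "b" because its scores sit closer to those of "a". Note that the search is a full linear scan followed by a sort, which is why it needs every vector in memory.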
  6. MySQL Prototype Schema
  7. MySQL Prototype Schema
     Need a sorted sparse vector on boot.
     [ 100, 0, 20, … 60 ]
     10K artworks: 5 minutes to startup
     5 minutes to accomplish … nothing.
  8. MongoDB “Schema”
     Genome, Embedded in Artwork
     Genome.genes: it’s a hash!
     { "Portrait" => 100, …, "Conceptual" => 20 }
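To illustrate why a hash of genes fits this data, here is a rough sketch in plain Ruby (no MongoDB driver). The nested document layout, the `sparse_euclidean` helper, the sample scores, and the rule that a missing gene counts as 0 are assumptions made for the example, not details from the slides.

```ruby
# Hypothetical document shape: the genome is embedded in the artwork,
# and genes is a sparse Hash of gene name => score.
ARTWORK_A = {
  "title" => "A",
  "genome" => { "genes" => { "Portrait" => 100, "Conceptual" => 20 } }
}
ARTWORK_B = {
  "title" => "B",
  "genome" => { "genes" => { "Portrait" => 60 } }
}

# Euclidean distance over sparse gene hashes.
# Assumption: a gene absent from a hash is treated as a score of 0.
def sparse_euclidean(genes_a, genes_b)
  keys = genes_a.keys | genes_b.keys
  Math.sqrt(keys.sum { |k| (genes_a.fetch(k, 0) - genes_b.fetch(k, 0))**2 })
end

puts sparse_euclidean(ARTWORK_A["genome"]["genes"], ARTWORK_B["genome"]["genes"])
```

Storing only the genes an artwork actually has avoids rebuilding a dense, sorted vector for every artwork at startup.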
  9. Choosing MongoDB
     Something new? Got (far too) many years of experience with *SQL / DW
     @harryh uses it @ 4sq
     @eliothorowitz looks pretty smart
     db.startups.find({ location : { $near : GA }, category : 'nosqldb vendor' }).first = 10gen
     install … ? … profit
     available on Heroku from MongoHQ
     continuous deployment friendly
  10. Using MongoDB
      MongoDB retrieval by ID is fast, maybe faster than a Ruby Hash
      Using Rails + Rake and Mongo is safer than mongo shell db.collection.update({x: y})
      Shared Hosting is not Rubber, You Can’t Stretch It
      Map/reduce for live queries really doesn’t work, no really: mongoid_fulltext
      Read-secondary + Map/Reduce can be fun: read_secondary: <%= $rails_rake_task.nil? or !$rails_rake_task %>
      Collection names are limited in length if you use mongodump: https://jira.mongodb.org/browse/SERVER-2973, fixed in 2.0.0
      copyDatabase requires administrative privileges: https://jira.mongodb.org/browse/SERVER-2846
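The `read_secondary` expression on this slide can be read as "read from a secondary unless running inside a Rake task". A small sketch of that logic follows; the `read_secondary?` method and its `flag` parameter are hypothetical stand-ins for the `$rails_rake_task` global, which is assumed here to be true only inside Rake tasks.

```ruby
# Mirrors the slide's expression: `flag.nil? or !flag`.
# `flag` stands in for $rails_rake_task (assumed true only under Rake).
def read_secondary?(flag)
  flag.nil? || !flag
end

puts read_secondary?(nil)   # flag never set, e.g. a web process
puts read_secondary?(true)  # inside a Rake task
```

The first call prints `true` and the second prints `false`. Presumably the point is to keep Rake-driven map/reduce jobs on the primary while ordinary reads go to a secondary; that reading is an inference, not something the slide states.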
  11. Using MongoDB (continued …)
      Mongo cursors aren’t snapshotted by default: Processing 5183 of 4012 …
      http://www.mongodb.org/display/DOCS/How+to+do+Snapshotted+Queries+in+the+Mongo+Database
      Mongo interest is growing, RoR + Mongoid = GTD
      http://code.dblock.org/ror-win-getting-things-done-with-mongodb-mongoid
      Mongoid Keeps Things Entertaining, Living on the Edge
  12. Deploying MongoDB
      MongoHQ Extensions via Heroku
      Production Directly w/ MongoHQ
      A Few Hundred Bucks / mo.
      Mongo 1.8.1 w/ replica sets, 2 DBs and 1 arbiter
      Different Availability Zones
      Dedicated RAM, separate EBS, shared CPU
      Early Issues, Now Very Stable
      Jason McCay + other folks @ MongoHQ = Awesome
      Mongoid 2.0.2
      mongoid_slug, mongoid_fulltext, mongoid_history, delayed_job_mongoid
  13. Thank you.
      name: Daniel Doubrovkine (aka dB.)
      company: http://art.sy ^work here
      twitter: @dblockdotorg
      blog: http://code.dblock.org ^link to slides here
      email: dblock@dblock.org
