When?, Why? and What? of MongoDB

Flavio [FlaPer87] Percoco Premoli
flaper87@flaper87.org
twitter: @flaper87
Sunday, May 8, 2011

When?

* Dictionaries!
* Spidering!
* Statistics!
* Queues!
* Logging!

Why?

* Unstructured data! (Spidering)
* Lots of reads! (Dictionaries, Queues)
* [JB]SON-like, document-oriented API (All)
* Lots of writes! (Logging, Statistics, Queues)

What?

* Make sure you create the right indexes

     import pymongo

     # let's get our collection (assumes 'connection' is an open pymongo connection)
     collection = connection['dictionaries']['it']

     # let's ensure there's an index on the key 'word', which every upsert filters on
     # (PyMongo 3+ calls this create_index; ensure_index was removed in PyMongo 4)
     collection.ensure_index([("word", pymongo.ASCENDING)])

     def insert_word(word, data):
         # replace the document for this word, or insert it if it is missing
         # (PyMongo 3+ calls this replace_one(filter, data, upsert=True))
         collection.update({'word' : word}, data, upsert=True)

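To see why the index on `word` matters for this upsert, here is a toy in-memory sketch, not MongoDB itself. Without an index, a lookup has to scan documents until one matches; with an index, the key leads straight to the document. The names `scan_lookup`, `build_index` and `index_lookup` and the sample documents are made up for illustration, and a real MongoDB index is a B-tree rather than a Python dict.

```python
# Toy model of an index; hypothetical names, no MongoDB involved.

def scan_lookup(docs, word):
    """Unindexed lookup: inspect documents one by one until the word matches."""
    steps = 0
    for doc in docs:
        steps += 1
        if doc["word"] == word:
            return doc, steps
    return None, steps


def build_index(docs):
    """Index on 'word': map each word to the position of its document."""
    return {doc["word"]: pos for pos, doc in enumerate(docs)}


def index_lookup(docs, index, word):
    """Indexed lookup: one dictionary access instead of a scan."""
    pos = index.get(word)
    return None if pos is None else docs[pos]


docs = [{"word": "w%d" % i, "lang": "it"} for i in range(1000)]
index = build_index(docs)

found, steps = scan_lookup(docs, "w999")
print(steps)  # number of documents the scan had to inspect
print(index_lookup(docs, index, "w999") is found)
```

Every `insert_word` call filters on `word`, so the cost of the scan would be paid on each upsert, not only on reads.
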
* Make sure you save what you really need

  import time
  import urlparse  # Python 2 module; urllib.parse in Python 3

  def parse(response):
      url_netloc = urlparse.urlsplit(response.url).netloc
      crawled = {
          "url" : response.url,
          "base_url" : url_netloc,
          "content" : response.body_as_unicode(),
          "status" : response.status,
          "encoding" : response.encoding,
          "headers" : response.headers,
          "lastcrawl" : time.time(),
      }

      # upsert keyed on the URL: one document per crawled page
      collection.update({'url' : response.url}, crawled, True)

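One way to act on this is to decide up front which fields later queries will read and to drop the rest before the upsert. A minimal sketch: `NEEDED_FIELDS` and `trim` are hypothetical names, and the choice of fields is only an example, not the author's.

```python
# Hypothetical whitelist: the fields this example pretends later queries need.
NEEDED_FIELDS = ("url", "base_url", "status", "lastcrawl")


def trim(document, keep=NEEDED_FIELDS):
    """Return a copy of document holding only the whitelisted keys it has."""
    return {key: document[key] for key in keep if key in document}


crawled = {
    "url": "http://example.com/page",
    "base_url": "example.com",
    "content": "<html>...</html>",
    "status": 200,
    "encoding": "utf-8",
    "headers": {"Content-Type": "text/html"},
    "lastcrawl": 1304336526.0,
}

small = trim(crawled)
print(sorted(small))  # keys that would be stored
```
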
* Make sure you understand that schemaless != mess

 # the two records name the same field differently: 'url' vs 'address'
 logs = [
     {'url' : "http://www.google.com", "time" : 1304336526.011287},
     {'address' : "http://www.yahoo.com", "time" : 1304336551.0424709 }
 ]

 # mess: each record is stored as-is, so the key holding the URL varies
 def insert_log():
     for log in logs:
         collection.insert(log)

 # better: give every record the same shape before inserting it
 def insert_log():
     for log in logs:
         log_to_insert = {
             "url" : log.get('url', log.get('address')),
             "time" : log.get('time')
         }
         collection.insert(log_to_insert)

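The normalization on this slide can be pulled into one function that also rejects records it cannot make sense of, so nothing malformed reaches the collection. This is a sketch only; `normalize_log` and `URL_KEYS` are hypothetical names, and raising `ValueError` is one possible policy among several.

```python
# Hypothetical alias list: keys under which a record may carry its URL.
URL_KEYS = ("url", "address")


def normalize_log(record):
    """Map a raw log record to the single shape {'url': ..., 'time': ...}."""
    url = next((record[key] for key in URL_KEYS if key in record), None)
    if url is None or "time" not in record:
        raise ValueError("incomplete log record: %r" % (record,))
    return {"url": url, "time": float(record["time"])}


logs = [
    {"url": "http://www.google.com", "time": 1304336526.011287},
    {"address": "http://www.yahoo.com", "time": 1304336551.0424709},
]

normalized = [normalize_log(log) for log in logs]
print([sorted(doc) for doc in normalized])  # every document has the same keys
```
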
* “Relate” what you occasionally need, “Embed” what you always need

     import time

     message = {
         'msg' : "This is a test message",
         'time' : time.time(),
         # embedded: the user fields needed every time the message is read
         'user' : {
             'username' : 'flaper87',
             'email' : 'flaper87@flaper87.org',
         }
     }

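A sketch of the two choices side by side, using plain dicts as stand-ins for collections. The `profiles` data, the `user_id` key and `load_profile` are invented for illustration: the message embeds the user fields that are shown with every message and only references the larger profile, which is fetched on demand.

```python
# Stand-in for a separate 'profiles' collection, keyed by a hypothetical user_id.
profiles = {
    "u1": {"username": "flaper87", "bio": "long text...", "settings": {"lang": "it"}},
}

message = {
    "msg": "This is a test message",
    # Embedded: needed every time the message is displayed.
    "user": {"username": "flaper87", "email": "flaper87@flaper87.org"},
    # Related: a reference to data that is only occasionally needed.
    "user_id": "u1",
}


def load_profile(msg):
    """Follow the reference; in a real database this is a second query."""
    return profiles.get(msg["user_id"])


print(message["user"]["username"])        # no extra lookup required
print(load_profile(message)["settings"])  # extra lookup, done on demand
```

The trade-off is that embedded fields are copied into every message, so a change to them has to be written to each copy.
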
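* ObjectIDs have an embedded datetime

     import pymongo
     from pymongo import errors

     # Empty and deserialize are defined elsewhere in the queue code
     def _get(self, queue):
         try:
             # an ObjectId starts with its creation timestamp, so sorting
             # on _id hands out roughly the oldest message first
             msg = self.client.database.command("findandmodify",
                     "messages",
                     query={"queue": queue},
                     sort={"_id": pymongo.ASCENDING}, remove=True)
         except errors.OperationFailure as exc:
             if "No matching object found" in exc.args[0]:
                 raise Empty()
             raise
         return deserialize(msg["value"]["payload"])
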
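The datetime lives in the first 4 bytes of an ObjectId, stored as big-endian seconds since the Unix epoch, which is why sorting on `_id` approximates insertion order. Here is a small sketch that decodes it from the 24-character hex form without any driver. `objectid_datetime` is a hypothetical helper; PyMongo's `ObjectId` exposes the same value as `generation_time`.

```python
from datetime import datetime, timezone


def objectid_datetime(oid_hex):
    """Decode the creation time embedded in a 24-character ObjectId hex string."""
    if len(oid_hex) != 24:
        raise ValueError("an ObjectId has 12 bytes (24 hex characters)")
    seconds = int(oid_hex[:8], 16)  # first 4 bytes: seconds since the epoch
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


print(objectid_datetime("507f1f77bcf86cd799439011").isoformat())
# 2012-10-17T21:13:27+00:00
```

The timestamp has one-second resolution, so ids created within the same second by different clients are not guaranteed to sort in exact insertion order.
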
Let's talk about MongoDB!!

Thanks!!
Thanks 10gen!!

