I've been using MongoDB for 2 years, and many times I've found myself asking "Why should I use it for this?" or "When should I really use MongoDB?" and, many other times, "What did I do wrong?".
Experience, examples and real use cases often tell you things that benchmarks and technical documentation don't. Therefore, I'll be presenting the When, Why and What of MongoDB. That's what really matters.
Sunday, May 8, 2011
12. * Lots of writes! (Logging, Statistics, Queues)
* [JB]SON-like, document-oriented API (All)
* Lots of reads! (Dictionaries, Queues)
* Unstructured data! (Spidering)
Why?
14. * Make sure you create the right indexes
import pymongo

# let's get our collection (connection is an existing pymongo connection)
collection = connection['dictionaries']['it']
# let's ensure there’s an index on the "word" key
collection.ensure_index([("word", pymongo.ASCENDING)])

def insert_word(word, data):
    # upsert: replace the entry for this word, or insert it if missing
    collection.update({'word': word}, data, upsert=True)
What?
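The slide uses the PyMongo API of the time. `ensure_index` and `update` were removed in PyMongo 4, so here is a minimal sketch of the same idea with `create_index` and `replace_one`. The helper names `ensure_word_index` and `upsert_word` are hypothetical, and `collection` is assumed to be a PyMongo collection obtained elsewhere.

```python
ASCENDING = 1  # same value as pymongo.ASCENDING


def ensure_word_index(collection):
    """Create an ascending index on "word" (assumed helper, not from the talk)."""
    return collection.create_index([("word", ASCENDING)])


def upsert_word(collection, word, data):
    """Replace the document for `word`, inserting it if it is missing."""
    # Include the key in the replacement so the stored document always has it.
    document = dict(data, word=word)
    return collection.replace_one({"word": word}, document, upsert=True)
```

Because the lookup key and the indexed key are the same field, every upsert can use the index instead of scanning the collection.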
15. * Make sure you save what you really need
import time
import urlparse  # Python 2 module; on Python 3 use urllib.parse

def parse(response):
    url_netloc = urlparse.urlsplit(response.url).netloc
    crawled = {
        "url": response.url,
        "base_url": url_netloc,
        "content": response.body_as_unicode(),
        "status": response.status,
        "encoding": response.encoding,
        "headers": response.headers,
        "lastcrawl": time.time(),
    }
    # upsert: keep one document per URL
    collection.update({'url': response.url}, crawled, upsert=True)
What?
16. * Make sure you understand that schemaless != mess
logs = [
    {'url': "http://www.google.com", "time": 1304336526.011287},
    {'address': "http://www.yahoo.com", "time": 1304336551.0424709},
]

def insert_log():
    # the same field is stored under two different keys: a mess
    for log in logs:
        collection.insert(log)
What?
17. * Make sure you understand that schemaless != mess
logs = [
    {'url': "http://www.google.com", "time": 1304336526.011287},
    {'address': "http://www.yahoo.com", "time": 1304336551.0424709},
]

def insert_log():
    for log in logs:
        # normalize the key names before inserting
        log_to_insert = {
            "url": log.get('url', log.get('address')),
            "time": log.get('time'),
        }
        collection.insert(log_to_insert)
What?
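The same normalization can live in one small function so that every writer goes through it. This is only a sketch: `normalize_log` and `URL_ALIASES` are hypothetical names, and rejecting entries that lack a URL or a time is an assumption rather than something the slide prescribes.

```python
URL_ALIASES = ("url", "address")  # keys that have been used for the same field


def normalize_log(raw):
    """Return a log entry with exactly the keys "url" and "time"."""
    url = next((raw[key] for key in URL_ALIASES if key in raw), None)
    if url is None or "time" not in raw:
        raise ValueError("log entry needs a URL and a time: %r" % (raw,))
    return {"url": url, "time": raw["time"]}


logs = [
    {"url": "http://www.google.com", "time": 1304336526.011287},
    {"address": "http://www.yahoo.com", "time": 1304336551.0424709},
]
normalized = [normalize_log(log) for log in logs]
```

Every document in `normalized` now has the same shape, so a single query on `url` finds all of them.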
18. * “Relate” what you occasionally need, “Embed” what you always need
import time

message = {
    'msg': "This is a test message",
    'time': time.time(),
    # embedded: the user fields needed every time the message is read
    'user': {
        'username': 'flaper87',
        'email': 'flaper87@flaper87.org',
    },
}
What?
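For contrast, here is a sketch of the "related" form, where the message stores only a reference and the user is looked up when needed. Plain dictionaries stand in for collections; the `users` mapping, the `user_id` key and the `resolve_user` helper are all hypothetical.

```python
import time

# Stand-in for a separate "users" collection, keyed by _id.
users = {
    "flaper87": {"_id": "flaper87", "email": "flaper87@flaper87.org"},
}

# Embedded: the fields needed on every read travel with the message.
embedded_message = {
    "msg": "This is a test message",
    "time": time.time(),
    "user": {"username": "flaper87", "email": "flaper87@flaper87.org"},
}

# Related: only a reference is stored; reading the user costs a second lookup.
related_message = {
    "msg": "This is a test message",
    "time": time.time(),
    "user_id": "flaper87",
}


def resolve_user(message, users):
    """Fetch the referenced user, as a second query would."""
    return users[message["user_id"]]
```

The embedded form answers "who wrote this?" in one read, while the related form keeps a single copy of the user data at the price of an extra lookup.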
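19. * ObjectIDs have an embedded datetime
import pymongo
from pymongo import errors

# Empty and deserialize are defined elsewhere in the original module
def _get(self, queue):
    try:
        # pop the oldest message: sorting by _id follows creation time
        msg = self.client.database.command("findandmodify",
                                           "messages",
                                           query={"queue": queue},
                                           sort={"_id": pymongo.ASCENDING},
                                           remove=True)
    except errors.OperationFailure as exc:
        if "No matching object found" in exc.args[0]:
            raise Empty()
        raise
    return deserialize(msg["value"]["payload"])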
What?
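The first 4 bytes of an ObjectId are a big-endian Unix timestamp in seconds, which is why sorting by `_id` roughly follows creation order in the queue code on this slide. A minimal sketch of decoding that timestamp with only the standard library follows; `objectid_datetime` is a hypothetical helper, and the sample id is built by hand rather than taken from a real database. With PyMongo installed, `ObjectId.generation_time` gives the same information.

```python
from datetime import datetime, timezone


def objectid_datetime(oid_hex):
    """Decode the creation time embedded in a 24-character ObjectId string."""
    if len(oid_hex) != 24:
        raise ValueError("an ObjectId has 24 hexadecimal characters")
    seconds = int(oid_hex[:8], 16)  # first 4 bytes = seconds since the epoch
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# Hand-built id: timestamp for 2011-05-08 00:00:00 UTC, remaining bytes zeroed.
sample = format(1304812800, "08x") + "0" * 16
created = objectid_datetime(sample)
```

Since the timestamp has one-second resolution, it can often replace a separate "created at" field when that precision is enough.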
21. Thanks!!
Let's talk about MongoDB!! Thanks, 10gen!!