Moma-Django
Overview
Django Boston meetup, 02-27-2014
Django + MongoDB:
building a custom ORM layer

moma-django overview --> Django + MongoDB: building a custom ORM layer

moma-django is a MongoDB manager for Django. It provides native Django ORM support for MongoDB documents, including the query API and the admin interface. It was developed as part of two commercial products and released as open source. In the talk we will review the motivation behind its development and its features, and go through 2-3 examples of how to use some of them: migrating an existing model, advanced queries and the admin interface. If time permits we will discuss unit testing and South migrations.

Please find the video at: http://www.youtube.com/watch?v=cxQKTDLjb-w

Published in: Technology

Transcript of "moma-django overview --> Django + MongoDB: building a custom ORM layer"

  1. Moma-Django Overview. Django Boston meetup, 02-27-2014
  2. Django + MongoDB: building a custom ORM layer
     Overview of the talk: moma-django is a MongoDB manager for Django. It provides native Django ORM support for MongoDB documents, including the query API and the admin interface. It was developed as part of two commercial products and released as open source. In the talk we will review the motivation behind its development and its features, and go through 2-3 examples of how to use some of them: migrating an existing model, advanced queries and the admin interface. If time permits we will discuss unit testing and South migrations.
  3. Who are we?
     • Company: Cloudoscope.com
     • What we do:
       – Cloudoscope’s product enables IT vendors to automate the presales process by collecting and analyzing prospect IT performance
       – Previous product - Lucidel: B2C marketing analytics based on website data
       – Data intensive projects / sites, NoSQL, analytics focus (as a way of funding)
     • Gadi Oren: @gadioren, gadioren
  4. Why moma-django?
     • Certain problems can be addressed well with NoSQL
     • The team wants to experiment with NoSQL
     HOWEVER:
     • A lot of code needs to be rewritten
     • Team → learn a new API
     • Some of the tools and procedures are no longer functioning and should be replaced
       – Admin interface
       – Unit testing environment
     • Some of the data needs to be somewhat de-normalized*
  5. Why moma-django? (our example)
     • Needed a very efficient way of processing timeseries
     • The timeseries were constantly growing
     • We required very detailed search/slice/dice capabilities to find the timeseries to be processed
     • Some of the data was optional (e.g. demographics information was never complete)
     • Document size, content and structure varied widely
     • However, we have a small distributed team and we did not want to create a massive project
     • We started experimenting with a stub Manager, working in small iterations and adding functionality as we needed it over nine months
  6. Other packages
     • PyMongo – a dependency for moma-django
     • MongoEngine – somewhat similar concepts in terms of models
     • Non relational versions of Django
  7. “Native” - advantages
     • Django packages and plugins (e.g. Admin functionality)
     • Using similar code conventions
     • Easier to bring in new team members
     • Use the same unit testing frameworks (e.g. Jenkins)
     • Simple experimentation and migration path
  8. Let’s make it interactive
     Questions Anyone??? (Example Application)
     • Small question asking application
     • Allows voting and adding images
     • Implemented as a Django application over MongoDB, using moma-django
     • Register and login at http://momadjango.org
     • Ask away!
  9. Migrating an existing model

     Before (standard Django models):

         class TstBook(models.Model):
             name = models.CharField(max_length=64)
             publish_date = MongoDateTimeField()
             author = models.ForeignKey('testing.TstAuthor')

             class Meta:
                 unique_together = ['name', 'author']

         class TstAuthor(models.Model):
             first_name = models.CharField(max_length=32)
             last_name = models.CharField(max_length=32)

     After (the same models on top of MongoModel):

         class TstBook(MongoModel):
             name = models.CharField(max_length=64)
             publish_date = MongoDateTimeField()
             author = models.ForeignKey('testing.TstAuthor')

             class Meta:
                 unique_together = ['name', 'author']

         class TstAuthor(MongoModel):
             first_name = models.CharField(max_length=32)
             last_name = models.CharField(max_length=32)

         # Hook the Mongo handler into syncdb
         models.signals.post_syncdb.connect(post_syncdb_mongo_handler)
  10. Migrating an existing model (2)
      • Syncdb:
      • Add objects
  11. Migrating an existing model (2)
      • Syncdb:
      • Add objects

          >>> TstBook(name="Good night half moon",
          ...         publish_date=datetime.datetime(2014, 2, 20),
          ...         author=TstAuthor.objects.get(first_name="Gadi")).save()
  12. Migrating an existing model (3)
      • Breaching uniqueness → try to save the same object again
  13. Migrating an existing model (4)
      • In Mongo: content, indexes

          class Meta:
              unique_together = ['name', 'author']

      • Admin
  14. New field types
      • MongoIDField – Internal. Used to hold the MongoDB object ID
      • MongoDateTimeField – Used for Datetime
      • ValuesField – Used to represent a list of objects of any type
      • StringListField – Used for a list of strings
      • DictionaryField – Used as a dictionary
      • Current limitation: nested structures have limited support
  15. Queries and update – 1: bulk insert

          records.append({
              "_id": ObjectId("502abdabf7f16836f100285a"),
              "time_on_site": 290,
              "user_id": 1154449631,
              "account_id": NumberLong(5),
              "campaign": "(not set)",
              "first_visit_date": ISODate("2012-07-30T17:10:06Z"),
              "referral_path": "(not set)",
              "source": "google",
              "exit_page_path": "/some-analysis/lion-king/",
              "landing_page_path": "(not set)",
              "keyword": "wikipedia lion king",
              "date": ISODate("2012-07-30T00:00:00Z"),
              "visit_count": 1,
              "page_views": 3,
              "visit_id": "false---------------1154449631.1343668206",
              "goal_values": {},
              "goal_starts": {},
              "demographics": {},
              "goal_completions": {},
              "location": {"cr": "United States", "rg": "California", "ct": "Pasadena"},
          })

          UniqueVisit.objects.filter(account__in=self.list_of_accounts).delete()
          UniqueVisit.objects.bulk_insert(records)
  16. Queries and update – 2: examples

          def ISODate(timestr):
              res = datetime.strptime(timestr, "%Y-%m-%dT%H:%M:%SZ")
              res = res.replace(tzinfo=timezone.utc)
              return res

          # Datetime
          qs = UniqueVisit.objects.filter(
              first_visit_date__lte=ISODate("2012-07-30T12:29:05Z"))
          self.assertEqual(qs.query.spec, dict(  # pymongo expression
              {'first_visit_date': {'$lte': datetime(2012, 7, 30, 12, 29, 5, tzinfo=timezone.utc)}}))

          # Multiple conditions
          qs = UniqueVisit.objects.filter(
              first_visit_date__lte=ISODate("2012-07-30T12:29:05Z"),
              time_on_site__gt=10,
              page_views__gt=2)
          self.assertEqual(qs.query.spec, dict(  # pymongo expression
              {'time_on_site': {'$gt': 10.0},
               'page_views': {'$gt': 2},
               'first_visit_date': {'$lte': datetime(2012, 7, 30, 12, 29, 5, tzinfo=timezone.utc)}}))
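The mapping from keyword lookups to a PyMongo filter document shown on this slide can be illustrated with a small standalone sketch. This is not moma-django's implementation: `lookups_to_spec` and the `OPERATORS` table are hypothetical names introduced here, and the standard library's `datetime.timezone` stands in for whatever `timezone` module the slide imports.

```python
from datetime import datetime, timezone

# Hypothetical table: Django-style lookup suffix -> MongoDB comparison operator.
OPERATORS = {"lt": "$lt", "lte": "$lte", "gt": "$gt", "gte": "$gte", "ne": "$ne"}


def iso_date(timestr):
    """Parse a UTC timestamp such as 2012-07-30T12:29:05Z into an aware datetime."""
    parsed = datetime.strptime(timestr, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def lookups_to_spec(**lookups):
    """Translate keyword lookups into a PyMongo-style filter document."""
    spec = {}
    for key, value in lookups.items():
        field, sep, suffix = key.rpartition("__")
        if sep and suffix in OPERATORS:
            # Several operators on one field share a single sub-document.
            spec.setdefault(field, {})[OPERATORS[suffix]] = value
        else:
            # No recognised suffix: treat the whole key as an equality match.
            spec[key] = value
    return spec


spec = lookups_to_spec(
    first_visit_date__lte=iso_date("2012-07-30T12:29:05Z"),
    time_on_site__gt=10,
    page_views__gt=2,
)
```

Keeping the translation a pure function of the keyword arguments is what makes assertions against the generated spec, like the ones on the slide, cheap to write.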
  17. Queries and update – 3: examples

          # Different query optimizations
          qs = UniqueVisit.objects.filter(
              Q(time_on_site=10) | Q(time_on_site=25) | Q(time_on_site=275))
          self.assertEqual(qs.query.spec, dict(  # pymongo expression
              {'time_on_site': {'$in': [10.0, 25.0, 275.0]}}))

          # Multiple or Q expressions
          qs = UniqueVisit.objects.filter(
              Q(time_on_site=10) | Q(time_on_site=25) | Q(time_on_site=275) | Q(source='bing'))
          self.assertEqual(qs.query.spec, dict(  # pymongo expression
              {'$or': [{'time_on_site': 10.0}, {'time_on_site': 25.0},
                       {'time_on_site': 275.0}, {'source': 'bing'}]}))

          # Negate Q
          qs = UniqueVisit.objects.filter(
              ~Q(first_visit_date=ISODate("2012-07-30T12:29:05Z")))
          self.assertEqual(qs.query.spec, dict(  # pymongo expression
              {'first_visit_date': {'$ne': datetime(2012, 7, 30, 12, 29, 5, tzinfo=timezone.utc)}}))
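The optimization in the first example (OR-ed equality tests on one field collapse into a single `$in`) can be sketched in plain Python. This is only an illustration of the idea under the assumption that conditions arrive as `(field, value)` pairs; `or_to_spec` and `negate` are hypothetical helpers, not part of moma-django.

```python
def or_to_spec(conditions):
    """Combine (field, value) equality conditions joined by OR into one filter.

    If every condition targets the same field, a single $in is enough;
    otherwise fall back to an explicit $or list.
    """
    fields = {field for field, _ in conditions}
    if len(fields) == 1:
        field = next(iter(fields))
        return {field: {"$in": [value for _, value in conditions]}}
    return {"$or": [{field: value} for field, value in conditions]}


def negate(field, value):
    """Negated equality, the shape produced by ~Q(field=value) in the examples."""
    return {field: {"$ne": value}}


same_field = or_to_spec([("time_on_site", 10), ("time_on_site", 25), ("time_on_site", 275)])
mixed_fields = or_to_spec([("time_on_site", 10), ("source", "bing")])
```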
  18. Queries – 4: extensions beyond standard Django

          # Dot notation
          qs = UniqueVisit.objects.filter(location__rg__exact="New York")
          self.assertEqual(qs.query.spec, dict(  # pymongo expression
              {'location.rg': 'New York'}))

          # Check key existence
          qs = UniqueVisit.objects.filter(demographics__age__exists="true")
          self.assertEqual(qs.query.spec, dict(  # pymongo expression
              {'demographics.age': {'$exists': 'true'}}))

          # Variable type
          qs = UniqueVisit.objects.filter(landing_page_path__type=int)
          self.assertEqual(qs.query.spec, dict(  # pymongo expression
              {'landing_page_path': {'$type': 16}}))
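As a rough illustration of these extensions, the sketch below turns a double-underscore path into MongoDB dot notation and maps a Python type to its BSON type number for `$type`. The function name and the (partial) type table are assumptions made for this example; the type numbers themselves are the standard BSON ones. The slide passes the string "true" to `exists`; the example here uses a boolean, which is the conventional argument to `$exists`.

```python
# BSON type numbers as used by MongoDB's $type operator (partial table).
BSON_TYPE_CODES = {float: 1, str: 2, dict: 3, list: 4, bool: 8, int: 16}


def nested_lookup_to_spec(lookup, value):
    """Translate e.g. location__rg__exact="New York" into {'location.rg': 'New York'}."""
    *path, last = lookup.split("__")
    if last == "exact":
        return {".".join(path): value}
    if last == "exists":
        return {".".join(path): {"$exists": value}}
    if last == "type":
        return {".".join(path): {"$type": BSON_TYPE_CODES[value]}}
    # No trailing operator: every segment is part of the document path.
    return {".".join(path + [last]): value}


by_region = nested_lookup_to_spec("location__rg__exact", "New York")
has_age = nested_lookup_to_spec("demographics__age__exists", True)
int_path = nested_lookup_to_spec("landing_page_path__type", int)
```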
  19. Queries - by the structure of documents

          >>> # How many documents in the DB?
          >>> UniqueVisit.objects.all().count()
          20
          >>> # For how many documents in the DB do we have age information?
          >>> UniqueVisit.objects.filter(demographics__age__exists="true").count()
          7
          >>> # For how many documents in the DB do we have gender information?
          >>> UniqueVisit.objects.filter(demographics__gender__exists="true").count()
          3
          >>> # For how many documents in the DB do we have gender and age information?
          >>> UniqueVisit.objects.filter(demographics__age__exists="true",
          ...                            demographics__gender__exists="true").count()
          1
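What these counts mean can be shown without a database: a document matches an existence test when every key along the dotted path is present. The following is an in-memory sketch with made-up sample documents; `has_path` and `count_with_paths` are hypothetical helpers and only approximate how MongoDB evaluates `$exists` on nested documents.

```python
def has_path(document, dotted_path):
    """Return True if every key along a dotted path exists in a nested dict."""
    current = document
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return False
        current = current[key]
    return True


def count_with_paths(documents, *dotted_paths):
    """Count documents in which all of the given dotted paths exist."""
    return sum(all(has_path(doc, path) for path in dotted_paths) for doc in documents)


# Made-up sample documents; demographics is optional and often incomplete.
visits = [
    {"user_id": 1, "demographics": {"age": 34}},
    {"user_id": 2, "demographics": {"gender": "f"}},
    {"user_id": 3, "demographics": {"age": 51, "gender": "m"}},
    {"user_id": 4, "demographics": {}},
]

with_age = count_with_paths(visits, "demographics.age")
with_both = count_with_paths(visits, "demographics.age", "demographics.gender")
```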
  20. Manipulating documents payload

          # Model
          class Question(MongoModel):
              user = models.ForeignKey(User)
              date = MongoDateTimeField(db_index=True)
              question = models.CharField(max_length=256)
              docs = DictionaryField(models.CharField())
              image = DictionaryField(models.TextField())
              audio = DictionaryField()
              other = DictionaryField()
              vote_ids = ValuesField(models.IntegerField())

              def __unicode__(self):
                  return u'%s[%s %s]' % (self.question, self.date, self.user, )

              class Meta:
                  unique_together = ['user', 'question',]

          # Store an image: get the image from the "POST" upload form (snippet)
          docfile = request.FILES['docfile']
          question_id = form.cleaned_data['question_id']
          docfile_name = docfile.name
          docfile_name_changed = _replace_dots(docfile.name)
          question = Question.objects.get(id=question_id)

          # Store meta-data
          question.docs.update({docfile_name_changed: docfile.content_type})
          question.image.update({
              docfile_name_changed + '_url': '/static/display/s_' + docfile_name,
              docfile_name_changed + '_name': docfile_name,
              docfile_name_changed + '_content_type': docfile.content_type})

          # Store the actual image binary block (small scale implementation)
          file_read = docfile.file.read()
          # Note – this is a naïve implementation!
          file_data = base64.b64encode(file_read)
          question.image.update({docfile_name_changed + '_data': file_data})
          question.save()
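The slide calls `_replace_dots` without showing it. A plausible reason for such a helper is that a dot inside a key would collide with MongoDB's dot notation for nested fields. The sketch below is a guess at that helper (the underscore replacement is an assumption) together with a standalone version of the dictionary built for an uploaded image; `image_entries` is a name made up for this example.

```python
import base64


def replace_dots(name, replacement="_"):
    """Make a file name usable as a document key by removing dots (assumed behaviour)."""
    return name.replace(".", replacement)


def image_entries(file_name, content_type, raw_bytes):
    """Build the key/value pairs stored alongside an uploaded image."""
    key = replace_dots(file_name)
    return {
        key + "_url": "/static/display/s_" + file_name,
        key + "_name": file_name,
        key + "_content_type": content_type,
        # Base64 keeps the binary payload representable as text.
        key + "_data": base64.b64encode(raw_bytes).decode("ascii"),
    }


entries = image_entries("cat.png", "image/png", b"\x89PNG")
```

As the slide itself notes, storing the encoded bytes inline is a naive, small-scale approach: base64 inflates the payload by about a third, and MongoDB caps the size of a single document.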
  21. Admin interface
  22. So – what’s next?
      • Github: https://github.com/gadio/moma-django
      • If you want to contribute – please contact (forking is also an option)
      • Contact: gadi.oren.1 at gmail.com or gadi at Cloudoscope.com
  23. Backup
  24. South
      • Dealing with apps with mixed models → South to disregard the model

          from south.modelsinspector import add_introspection_rules

          # Enabling South for the non conventional mongo model
          add_introspection_rules(
              [
                  (
                      (MongoIdField, MongoDateTimeField, DictionaryField),
                      [],
                      {
                          "max_length": ["max_length", {"default": None}],
                      },
                  ),
              ],
              ["^moma_django.fields.*"],
          )
  25. Unit testing
      • The model name is defined in settings.py
      • In a unit testing run, a new MongoDB schema is created: MONGO_COLLECTION prefixed with "test_" (e.g. test_momaexample)

          MONGO_HOST = 'localhost'
          MONGO_PORT = 27017
          MONGO_COLLECTION = 'momaexample'
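The naming rule for test runs can be sketched as a small function. This only illustrates the "test_" prefix described on the slide; `effective_database_name` and its `testing` flag are hypothetical, and how moma-django actually detects a test run is not shown in the talk.

```python
MONGO_HOST = "localhost"
MONGO_PORT = 27017
MONGO_COLLECTION = "momaexample"


def effective_database_name(configured_name, testing):
    """Return the name to connect to, prefixing it with test_ during test runs."""
    if testing and not configured_name.startswith("test_"):
        return "test_" + configured_name
    return configured_name


test_name = effective_database_name(MONGO_COLLECTION, testing=True)
normal_name = effective_database_name(MONGO_COLLECTION, testing=False)
```

Deriving the test name from the configured one keeps test data isolated from the working data without a second settings file.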
  26. Moma-django on google…