Omnibus database machine

How can you use PostgreSQL as a schemaless (NoSQL) database? Here we cover our use case and highlight upcoming features in Postgres 9.4 and its integration with Django 1.7.

Published in: Software

  1. 1. omnibus database machine How to NoSQL in Postgres with Django
  2. 2. so much database I’m Aleck mainly: django + python + js github.com/alecklandgraf @aleck_landgraf aleck@buildingenergy.com I’m Gavin github.com/gmcquillan @gmcquillan
  3. 3. our use case people want to load their data… … and play with it: - django app for managing building data - unstructured data - relational data - orgs/perms/projects - utility meter ts data
  4. 4. we have unstructured data - lack of fixed schema - can create new fields without DB migration - 2,000 fields in the standard ontology - business logic (MCM) to map raw data to an ontology if possible - other logic to keep track of keys - different for each user or organization
  5. 5. we have relational data - everything we load relates to everything else in the system - buildings → organizations → permissions - buildings → utilities → meters → ts data - building v1 (2010) → building v2 (2011)
  6. 6. unstructured and relational data - django can connect to mongo - django can connect to postgres - complex - is there something better?
  7. 7. PostgreSQL 9.2+ JSON type - native type for JSON (9.4+ jsonb) - GIN index, soon GiST index - supports search and lookup - easy to convert python dict to json - json is the de facto standard for web APIs
  8. 8. why not mongo * - MongoDB doesn’t seem to be more performant than PostgreSQL. - And you still get all of PostgreSQL’s goodies. - Larger documents will probably continue to favor PostgreSQL. - As will larger tables. *from Christophe Pettus’ talk at OSCON ‘13
  9. 9. did someone say django app? - are you ready to get beta - django-pgjson (version 0.2.0, 2014-09-13) - works best with dev postgres 9.4 and Django 1.7+ - in beta development - also a kickstarter project to help build some of this out
  10. 10. setting up your django model
      from django.db import models
      from django_pgjson.fields import JsonField
      class Something(models.Model):
          name = models.CharField(max_length=32)
          data = JsonField()
  11. 11. querying the JSON data
      >>> Something.objects.bulk_create([
      ...     Something(data={"name": "foo", "tags": ["sad", "romantic"]}),
      ...     Something(data={"name": "bar", "tags": ["sad", "intelligent"]})
      ... ])
      >>> Something.objects.filter(data__at_name="foo").count()
      1
      >>> Something.objects.filter(data__jcontains={"tags": ["sad"]}).count()
      2
  12. 12. what about order_by and distinct? - unsupported until PostgreSQL 9.4 with jsonb - we wrote our own inside a custom queryset
      # fall back to sorting in Python when order_by is a JSON key, not a model column
      if order_by not in known_columns:
          qs = list(qs)
          qs.sort(
              key=lambda x: getattr(x, field).get(order_by),
              reverse=order_by_rev
          )
  13. 13. JSON value order_by in SQL
      SELECT id,
             NULLIF(extra_data->>'Total GHG Emissions (MtCO2e)', '')::float AS ghg_emissions
      FROM seed_buildingsnapshot
      WHERE extra_data->>'Total GHG Emissions (MtCO2e)' != ''
      ORDER BY NULLIF(extra_data->>'Total GHG Emissions (MtCO2e)', '')::float DESC
      LIMIT 10;
      Django order_by doesn’t work on ‘non-model-fields’
  14. 14. order_by - can also push it to Postgres with a PL script - http://hyperthese.net/post/sorting-json-fields-in-postgresql/ - jsonb support in Postgres 9.4 for this, but not for json
  15. 15. is this fast?
  16. 16. who is using this? With JSONB and other enhancements in 9.4 "we now have full document storage and awesome performance with little effort," explained Craig Kerstiens, a developer at Salesforce-backed Heroku, in a personal blog post.
  17. 17. further thoughts - PostgreSQL full text search with JSON? - joins
  18. 18. links - https://github.com/djangonauts/django-pgjson - http://www.postgresql.org/docs/9.4/static/functions-json.html - http://thebuild.com/presentations/pg-as-nosql-pgday-fosdem-2013.pdf - http://www.postgresql.org/docs/9.4/static/datatype-json.html - http://lwn.net/Articles/599705/ - http://simko.home.cern.ch/simko/postgresql-mongodb-json-select-speed.html
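
Slide 12 sorts in Python when the order_by key lives inside the JSON column. The fragment on the slide depends on the surrounding queryset code, so here is a minimal standalone sketch of the same idea. The helper name `sort_by_json_key`, the `ghg` key, and the `SimpleNamespace` stand-ins for model instances are all hypothetical, not part of the talk. The sketch also puts rows that lack the key at the end, because in Python 3 comparing `None` with a number raises `TypeError`. It assumes the values present under the key are mutually comparable.

```python
from types import SimpleNamespace


def sort_by_json_key(rows, field, key, reverse=False):
    """Sort objects by a key inside a dict-valued attribute.

    Rows whose dict is missing the key (or is None) are kept at the end,
    regardless of sort direction.
    """
    present, missing = [], []
    for row in rows:
        data = getattr(row, field) or {}
        if data.get(key) is None:
            missing.append(row)
        else:
            present.append(row)
    present.sort(key=lambda row: getattr(row, field)[key], reverse=reverse)
    return present + missing


# Stand-ins for model instances with a JSON column named extra_data.
rows = [
    SimpleNamespace(id=1, extra_data={"ghg": 3.5}),
    SimpleNamespace(id=2, extra_data={}),
    SimpleNamespace(id=3, extra_data={"ghg": 10.0}),
    SimpleNamespace(id=4, extra_data=None),
]

descending_ids = [r.id for r in sort_by_json_key(rows, "extra_data", "ghg", reverse=True)]
ascending_ids = [r.id for r in sort_by_json_key(rows, "extra_data", "ghg")]
```

This materializes the whole result set in memory, which is the cost of the approach on slide 12. Sorting in SQL, as on slide 13, avoids that.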
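
To make the `jcontains` filter on slide 11 more concrete, the following is a rough pure-Python model of jsonb containment. It assumes that `jcontains` corresponds to the PostgreSQL `@>` operator. The function `json_contains` is hypothetical and simplified: it ignores PostgreSQL's special case for a top-level array containing a scalar, and it compares scalars with plain Python equality. It is only meant to show why both sample rows match `{"tags": ["sad"]}`.

```python
def json_contains(doc, query):
    """Simplified model of jsonb containment (doc @> query)."""
    if isinstance(query, dict):
        # Every key in the query must exist in doc with a contained value.
        return isinstance(doc, dict) and all(
            k in doc and json_contains(doc[k], v) for k, v in query.items()
        )
    if isinstance(query, list):
        # Every query element must be contained in some element of doc;
        # order and duplicates do not matter.
        return isinstance(doc, list) and all(
            any(json_contains(item, q) for item in doc) for q in query
        )
    # Scalars: plain equality (a simplification).
    return doc == query


# The two documents created on slide 11.
docs = [
    {"name": "foo", "tags": ["sad", "romantic"]},
    {"name": "bar", "tags": ["sad", "intelligent"]},
]

sad_count = sum(json_contains(d, {"tags": ["sad"]}) for d in docs)
foo_count = sum(json_contains(d, {"name": "foo"}) for d in docs)
```

Under this model the counts agree with the slide: two documents contain the tag "sad" and one has the name "foo".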
