How can you use PostgreSQL as a schemaless (NoSQL) database? Here we cover our use case and highlight upcoming features in PostgreSQL 9.4 and its integration with Django 1.7.
Aleck Landgraf, VP Software Engineering and Co-founder at Building Energy Inc.
2. so much database
I’m Aleck
mainly: django + python + js
github.com/alecklandgraf
@aleck_landgraf
aleck@buildingenergy.com
I’m Gavin
github.com/gmcquillan
@gmcquillan
3. our use case
people want to load their data…
… and play with it:
- django app for managing building data
- unstructured data
- relational data
- orgs/perms/projects
- utility meter ts data
4. we have unstructured data
- lack of fixed schema
- can create new fields without DB migration
- 2,000 fields in the standard ontology
- business logic (MCM) to map raw data to an ontology if possible
- other logic to keep track of keys
- different for each user or organization
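A minimal sketch of the mapping idea on this slide, not the actual MCM code: columns that a per-organization mapping recognizes go to ontology fields, and everything else is kept as unstructured data. The function name `split_record`, the mapping, and the sample column names are all hypothetical.

```python
def split_record(raw, column_mapping):
    """Split one raw row into ontology fields and leftover unstructured fields.

    column_mapping is a hypothetical per-organization dict of
    raw column name -> ontology field name.
    """
    mapped, extra = {}, {}
    for key, value in raw.items():
        if key in column_mapping:
            mapped[column_mapping[key]] = value
        else:
            # No known ontology field: keep the raw key so nothing is lost.
            extra[key] = value
    return mapped, extra


# Illustrative data only.
mapping = {"Bldg Name": "name", "Sq Ft": "gross_floor_area"}
row = {"Bldg Name": "City Hall", "Sq Ft": 52000, "Roof Color": "white"}
mapped, extra = split_record(row, mapping)
```

Here `mapped` would feed regular model columns, while `extra` is the part that needs a schemaless store.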
5. we have relational data
- everything we load relates to everything else in the system
- buildings → organizations → permissions
- buildings → utilities → meters → ts data
- building v1 (2010) → building v2 (2011)
6. unstructured and relational data
- django can connect to mongo
- django can connect to postgres
- complex
- is there something better?
7. PostgreSQL 9.2+ JSON type
- native type for JSON (9.4+ jsonb)
- GIN index, soon GiST index
- supports search and lookup
- easy to convert python dict to json
- json is the de facto standard for web APIs
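To illustrate the "easy to convert python dict to json" bullet, here is a small sketch using only the standard library `json` module. The dict contents are made up; a JSON field library would typically do this serialization for you when saving.

```python
import json

# Illustrative unstructured fields for one building.
extra_data = {"Total GHG Emissions (MtCO2e)": "12.5", "Roof Color": "white"}

# dict -> JSON text, the form a json/jsonb column stores.
payload = json.dumps(extra_data, sort_keys=True)

# JSON text -> dict, the form the application works with.
restored = json.loads(payload)
```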
8. why not mongo *
- MongoDB doesn’t seem to be more performant than PostgreSQL.
- And you still get all of PostgreSQL’s goodies.
- Larger documents will probably continue to favor PostgreSQL.
- As will larger tables.
*from Christophe Pettus’ talk at OSCON ’13
9. did someone say django app?
- are you ready to get beta?
- django-pgjson (version 0.2.0, 2014-09-13)
- works best with the development version of PostgreSQL 9.4 and Django 1.7+
- in beta development
- also a kickstarter project to help build some of this out
10. setting up your django model
from django.db import models
from django_pgjson.fields import JsonField


class Something(models.Model):
    name = models.CharField(max_length=32)
    data = JsonField()
12. what about order_by and distinct?
- unsupported until PostgreSQL 9.4 with jsonb
- we wrote our own inside a custom queryset
# order_by names a JSON key rather than a model column, so sort in Python
if order_by not in known_columns:
    qs = list(qs)
    qs.sort(
        key=lambda x: getattr(x, field).get(order_by),
        reverse=order_by_rev
    )
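The snippet above returns `None` for rows that lack the key, and under Python 3 comparing `None` with numbers raises a `TypeError`. A self-contained sketch of the same in-memory approach that puts such rows last could look like this. The helper name `sort_by_json_key` and the sample objects are hypothetical, with `SimpleNamespace` standing in for model instances.

```python
from types import SimpleNamespace


def sort_by_json_key(rows, field, key, reverse=False):
    """Sort objects by the value at rows[i].<field>[key].

    Rows whose JSON dict lacks the key (or holds None) are placed last,
    regardless of sort direction.
    """
    rows = list(rows)  # evaluates a lazy queryset, as in the slide
    present = [r for r in rows if getattr(r, field).get(key) is not None]
    missing = [r for r in rows if getattr(r, field).get(key) is None]
    present.sort(key=lambda r: getattr(r, field)[key], reverse=reverse)
    return present + missing


# Stand-ins for model instances with a JSON field named extra_data.
buildings = [
    SimpleNamespace(name="a", extra_data={"ghg": 3.0}),
    SimpleNamespace(name="b", extra_data={}),
    SimpleNamespace(name="c", extra_data={"ghg": 7.5}),
]
ordered = [
    b.name
    for b in sort_by_json_key(buildings, "extra_data", "ghg", reverse=True)
]
```

Sorting in Python loads every row into memory, so this only suits result sets that are already small.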
13. JSON value order_by in SQL
SELECT id,
       NULLIF(extra_data->>'Total GHG Emissions (MtCO2e)', '')::float AS ghg_emissions
FROM seed_buildingsnapshot
WHERE extra_data->>'Total GHG Emissions (MtCO2e)' != ''
ORDER BY NULLIF(extra_data->>'Total GHG Emissions (MtCO2e)', '')::float DESC
LIMIT 10;
Django order_by doesn’t work on ‘non-model-fields’
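One way to reuse the query above for any JSON key is to build it with placeholders and hand it to a raw cursor. This is a sketch under assumptions, not the code from the talk: the helper name `top_by_json_key_sql` is hypothetical, and the `%s` placeholders assume a DB-API driver with that parameter style, such as psycopg2.

```python
def top_by_json_key_sql(table, json_column):
    """Return a parameterized query ordering rows by a numeric JSON value.

    table and json_column are interpolated into the SQL text, so they must
    come from trusted code, never from user input. The JSON key and the
    limit are passed separately as query parameters.
    """
    value = "NULLIF({col}->>%s, '')::float".format(col=json_column)
    return (
        "SELECT id, {value} AS sort_value "
        "FROM {table} "
        "WHERE {col}->>%s != '' "
        "ORDER BY {value} DESC "
        "LIMIT %s"
    ).format(value=value, table=table, col=json_column)


key = "Total GHG Emissions (MtCO2e)"
sql = top_by_json_key_sql("seed_buildingsnapshot", "extra_data")
# The key appears three times (SELECT, WHERE, ORDER BY), then the limit.
params = [key, key, key, 10]
```

With an open database cursor, this would be run as `cursor.execute(sql, params)`.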
14. order_by
- can also push it to Postgres with a PL script
- http://hyperthese.net/post/sorting-json-fields-in-postgresql/
- jsonb support in Postgres 9.4 for this, but not for json
16. who is using this?
With JSONB and other enhancements in 9.4 "we now have full document storage and awesome performance with little effort," explained Craig Kerstiens, a developer at Salesforce-backed Heroku, in a personal blog post.