Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

PostgreSQL and JSON with Python - Przemek Lewandowski

571 views

Published on

This presentation is a part of Sunscrapers’ weekly talks.

Published in: Software
  • Be the first to comment

  • Be the first to like this

PostgreSQL and JSON with Python - Przemek Lewandowski

  1. 1. sunscrapers.comYour favored tech partner. PostgreSQL and JSON with Python Przemek Lewandowski 16.06.2016
  2. 2. sunscrapers.comYour favored tech partner. 1. Whys 2. PostgreSQL types 3. HStore vs JSON vs JSONB 4. SQLAlchemy 5. Django Overview
  3. 3. sunscrapers.comYour favored tech partner. ● Schema-less data ● Schema-unknown data ● Document storage ● User defined schema Why?
  4. 4. sunscrapers.comYour favored tech partner. ● XML (since 8.2) ● HStore (since 8.2) ● JSON (since 9.2) ● JSONB (since 9.4) PostgreSQL types
  5. 5. sunscrapers.comYour favored tech partner. ● Only simple key/value pairs ● No nesting ● Strings only ● Indexing ● Many functions and operators HStore SELECT 'user => "fred", id => 1'::hstore -> 'user' AS user; user ------ fred
  6. 6. sunscrapers.comYour favored tech partner. ● Simple validation on input ● Stored as text (like XML) ● Preserves key order and duplicates ● Indexes (only expression index) JSON SELECT '[{"a": {"b":{"c": "foo"}}}'::json#>'{a,b}'; ?column? ------------- {"c": "foo"}
  7. 7. sunscrapers.comYour favored tech partner. ● Full JSON implementation ● Binary storage ● No key order or duplicate preservation ● Fast access operations ● Indexing ● No date type! JSONB SELECT '{"a": {"b": 2, "c": 3}}'::jsonb @> '{"a": {"c": 3}}'::jsonb;
  8. 8. sunscrapers.comYour favored tech partner. SELECT feed_data.feed_id AS feed_data_feed_id, feed_data.feed AS feed_data_feed, feed_data.data AS feed_data_data FROM Feed_data WHERE CAST(feed_data.data ->> 'DISSEMINATION_ID' AS INTEGER) <= 90000 AND (feed_data.data ->> 'ACTION') = 'NEW' ORDER BY feed_data.data -> 'PRICE_NOTATION' DESC LIMIT 100 OFFSET 0 Query example
  9. 9. sunscrapers.comYour favored tech partner. SQLAlchemy ● PostgreSQL dialect ● Close to database ● HStore, JSON, JSONB data types and much more
  10. 10. sunscrapers.comYour favored tech partner. from sqlalchemy.dialects.postgresql import JSONB # … class FeedData(db.Model): feed_id = Column(String(), primary_key=True) feed = Column(Enum(*feed_urls.keys(), name='feeds_enum'), nullable=False) data = Column(MutableDict.as_mutable(JSONB), nullable=False) def __init__(self, feed_id, feed, data): self.feed_id = feed_id self.feed = feed self.data = data def as_dict(self): data = {field: getattr(self, field) for field in ['feed_id', 'feed']} data.update(self.data) return data SQLAlchemy ORM example I
  11. 11. sunscrapers.comYour favored tech partner. def filter_by_date(db_field, dt_from, dt_to): return and_( db_field.cast(Date) >= dt_from, db_field.cast(Date) <= dt_to ) SQLAlchemy ORM example II
  12. 12. sunscrapers.comYour favored tech partner. Django ORM ● Django introduced PostgreSQL specific model fields ● HStoreField since 1.8 ● JSONField since 1.9 (uses JSONB) ● Support for migrations ● Third party libs like: django-hstore
  13. 13. sunscrapers.comYour favored tech partner. More info ● https://www.pgcon.org/2014/schedule/events/684.en.html ● https://www.postgresql.org/docs/9.4/static/datatype-json.html ● https://docs.djangoproject.com/en/1.9/ref/contrib/postgres/fields/ ● http://docs.sqlalchemy.org/en/latest/dialects/postgresql.html
  14. 14. sunscrapers.comYour favored tech partner. Thanks! Questions?

×