Making Postgres Central in Your Data Center

  • 979 views
Uploaded on

Postgres has the unique ability to act as a powerful data aggregator in many data centers. This presentation shows how Postgres's extensibility, access to foreign data sources, and ability handle …

Postgres has the unique ability to act as a powerful data aggregator in many data centers. This presentation shows how Postgres's extensibility, access to foreign data sources, and ability handle NoSQL-like and data warehousing workloads gives it unmatched capabilities to function in this role.

More in: Internet , Technology
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Be the first to comment
No Downloads

Views

Total Views
979
On Slideshare
0
From Embeds
0
Number of Embeds
1

Actions

Shares
Downloads
20
Comments
0
Likes
1

Embeds 0

No embeds

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
    No notes for slide

Transcript

  • 1. Making Postgres Central in Your Data Center BRUCE MOMJIAN February, 2014 This talk explores why Postgres is uniquely capable of functioning as a central database in enterprises. Title concept from Josh Berkus Creative Commons Attribution License http://momjian.us/presentations 1 / 38
  • 2. Outline 1. Object-Relational (extensibility) 2. NoSQL 3. Data analytics 4. Foreign data wrappers (database federation) 5. Central role Making Postgres Central in Your Data Center 2 / 38
  • 3. 1. Object-Relational (Extensibility) Object-relational databases like Postgres support support classes and inheritance, but most importantly, they define database functionality as objects that can be easily manipulated. http://en.wikipedia.org/wiki/Object-relational_database Making Postgres Central in Your Data Center 3 / 38
  • 4. How Is this Accomplished? starelid staattnum staop pg_statistic oprleft oprright oprresult oprcom oprnegate oprlsortop oprrsortop oprcode oprrest oprjoin pg_operator typrelid typelem typinput typoutput typbasetype pg_type prolang prorettype pg_proc pg_rewrite ev_class datlastsysoid pg_database tgfoid tgrelid pg_trigger inhrelid pg_inherits inhparent pg_language pg_namespacepg_depend pg_shadow pg_aggregate aggfinalfn aggtransfn aggfnoid aggtranstype castsource casttarget pg_cast castfunc pg_description pg_constraint contypid pg_conversion conproc amopopr amopclaid pg_attribute indexrelid attnum amopclaid atttypid indrelid pg_attrdef pg_group adrelid pg_index adnum pg_am pg_amop amgettuple reltoastidxid aminsert reltoastrelid amcostestimate amproc ambeginscan pg_amproc amrescan relfilenode amendscan relam ammarkpos reltype amrestrpos pg_class ambuild opcdeftype ambulkdelete pg_opclass attrelid http://www.postgresql.org/docs/current/static/catalogs.html Making Postgres Central in Your Data Center 4 / 38
  • 5. Example: ISBN Data Type CREATE EXTENSION isn; dT List of data types Schema | Name | Description --------+--------+-------------------------------------------------- public | ean13 | International European Article Number (EAN13) public | isbn | International Standard Book Number (ISBN) public | isbn13 | International Standard Book Number 13 (ISBN13) public | ismn | International Standard Music Number (ISMN) public | ismn13 | International Standard Music Number 13 (ISMN13) public | issn | International Standard Serial Number (ISSN) public | issn13 | International Standard Serial Number 13 (ISSN13) public | upc | Universal Product Code (UPC) http://www.postgresql.org/docs/current/static/isn.html Making Postgres Central in Your Data Center 5 / 38
  • 6. ISBN Behaves Just Like Built-In Types dTS … pg_catalog | integer | -2 billion to 2 billion integer, 4-byte storage … public | isbn | International Standard Book Number (ISBN) Making Postgres Central in Your Data Center 6 / 38
  • 7. The System Catalog Entry for INTEGER SELECT * FROM pg_type WHERE typname = ’int4’; -[ RECORD 1 ]--+--------- typname | int4 typnamespace | 11 typowner | 10 typlen | 4 typbyval | t typtype | b typcategory | N typispreferred | f typisdefined | t typdelim | , typrelid | 0 typelem | 0 typarray | 1007 typinput | int4in typoutput | int4out typreceive | int4recv typsend | int4send typmodin | - typmodout | - typanalyze | - typalign | i typstorage | p typnotnull | f Making Postgres Central in Your Data Center 7 / 38
  • 8. The System Catalog Entry for ISBN SELECT * FROM pg_type WHERE typname = ’isbn’; -[ RECORD 1 ]--+--------------- typname | isbn typnamespace | 2200 typowner | 10 typlen | 8 typbyval | t typtype | b typcategory | U typispreferred | f typisdefined | t typdelim | , typrelid | 0 typelem | 0 typarray | 16405 typinput | isbn_in typoutput | public.isn_out typreceive | - typsend | - typmodin | - typmodout | - typanalyze | - typalign | d typstorage | p typnotnull | f Making Postgres Central in Your Data Center 8 / 38
  • 9. Not Just Data Types, Languages CREATE EXTENSION plpythonu; dL List of languages Name | Owner | Trusted | Description -----------+----------+---------+------------------------------------------ plpgsql | postgres | t | PL/pgSQL procedural language plpythonu | postgres | f | PL/PythonU untrusted procedural language http://www.postgresql.org/docs/current/static/plpython.html Making Postgres Central in Your Data Center 9 / 38
  • 10. Available Languages ◮ PL/Java ◮ PL/Perl ◮ PL/pgSQL (like PL/SQL) ◮ PL/PHP ◮ PL/Python. ◮ PL/R (like SPSS) ◮ PL/Ruby ◮ PL/Scheme ◮ PL/sh ◮ PL/Tcl ◮ SPI (C) http://www.postgresql.org/docs/current/static/external-pl.html Making Postgres Central in Your Data Center 10 / 38
  • 11. Specialized Indexing Methods ◮ BTree ◮ Hash ◮ GiST (generalized search tree) ◮ SP-GiST (space-partitioned GiST) ◮ GIN (generalized inverted index) http://www.postgresql.org/docs/current/static/indexam.html Making Postgres Central in Your Data Center 11 / 38
  • 12. Index Types Are Defined in the System Catalogs Too SELECT amname FROM pg_am; amname -------- btree hash gist gin spgist http://www.postgresql.org/docs/current/static/catalog-pg-am.html Making Postgres Central in Your Data Center 12 / 38
  • 13. Operators Have Similar Flexibility Operators are function calls with left and right operators of specified types: doS Schema | Name | Left arg type | Right arg type | Result type | Description … pg_catalog | + | integer | integer | integer | add dfS Schema | Name | Result data type | Argument data types | Type … pg_catalog | int4pl | integer | integer, integer | normal Making Postgres Central in Your Data Center 13 / 38
  • 14. Other Extensibility ◮ Casts are defined in pg_cast, int4(float8) ◮ Aggregates are defined in pg_aggregate, sum(int4) Making Postgres Central in Your Data Center 14 / 38
  • 15. Externally Developed Plug-Ins ◮ PostGIS (Geographical Information System) ◮ PL/v8 (server-side JavaScript) ◮ experimentation, e.g. full text search was originally externally developed Making Postgres Central in Your Data Center 15 / 38
  • 16. Offshoots of Postgres ◮ AsterDB ◮ Greenplum ◮ Informix ◮ Netezza ◮ ParAccel ◮ Postgres XC ◮ Redshift (Amazon) ◮ Truviso ◮ Vertica ◮ Yahoo! Everest https://wiki.postgresql.org/wiki/PostgreSQL_derived_databases http://de.slideshare.net/pgconf/elephant-roads-a-tour-of-postgres-forks Making Postgres Central in Your Data Center 16 / 38
  • 17. Offshoots of Postgres https://raw.github.com/daamien/artwork/master/inkscape/PostgreSQL_timeline/timeline_postgresql.png Making Postgres Central in Your Data Center 17 / 38
  • 18. Plug-In Is Not a Bad Word Many databases treat extensions as special cases, with serious limitations. Postgres built-ins use the same API as extensions, so ll extensions operate just like built-in functionality. Making Postgres Central in Your Data Center 18 / 38
  • 19. Extensions and Built-In Facilities Behave the Same Extensions PL/R ISN PostGIS Postgres System Tables int4 btree sum() PL/pgSQL Making Postgres Central in Your Data Center 19 / 38
  • 20. 2. NoSQL SQL Making Postgres Central in Your Data Center 20 / 38
  • 21. NoSQL Types There is no single NoSQL technology. They all take different approaches and have different features and drawbacks: ◮ Key-Value stores, e.g. Redis ◮ Document databases, e.g. MongoDB (JSON) ◮ Columnar stores: Cassandra ◮ Graph databases: Neo4j Making Postgres Central in Your Data Center 21 / 38
  • 22. Why NoSQL Exists Generally, NoSQL provides fast querying, auto-sharding, and flexible schemas by avoiding: ◮ A powerful query language ◮ A sophisticated query optimizer ◮ Data normalization ◮ Joins ◮ Referential integrity ◮ Durability Making Postgres Central in Your Data Center 22 / 38
  • 23. Are These Drawbacks Worth the Cost? ◮ Difficult Reporting Data must be brought to the client for analysis, e.g. no aggregates or data analysis functions. Schema-less data requires complex client-side knowledge for processing ◮ Complex Application Design Without powerful query language and query optimizer, the client software is responsible for efficiently accessing data and for data consistency ◮ Durability Administrators are responsible for data retention Making Postgres Central in Your Data Center 23 / 38
  • 24. When Should NoSQL Be Used? ◮ Massive write scaling is required, more than a single server can provide ◮ Only simple data access pattern is required ◮ Additional resources allocation for development is acceptable ◮ Strong data retention or transactional guarantees are not required ◮ Unstructured duplicate data that greatly benefits from column compression Making Postgres Central in Your Data Center 24 / 38
  • 25. When Should Relational Storage Be Used? ◮ Easy administration ◮ Variable workloads and reporting ◮ Simplified application development ◮ Strong data retention Making Postgres Central in Your Data Center 25 / 38
  • 26. The Best of Both Worlds: Postgres Postgres has many NoSQL features without the drawbacks: ◮ Schema-less data types, with sophisticated indexing support ◮ Transactional schema changes with rapid additional and removal of columns ◮ Durability by default, but controllable per-table or per-transaction ◮ Postgres XC and PL/Proxy allow auto-sharding for write scaling Making Postgres Central in Your Data Center 26 / 38
  • 27. Schema-Less Data: JSON CREATE TABLE customer (id SERIAL, data JSON); INSERT INTO customer VALUES (DEFAULT, ’{"name" : "Bill", "age" : 21}’); SELECT data->’name’ FROM customer WHERE (data->’age’)::text = ’21’; ?column? ---------- "Bill" Making Postgres Central in Your Data Center 27 / 38
  • 28. Easy Relational Schema Changes ALTER TABLE customer ADD COLUMN status CHAR(1); BEGIN WORK; ALTER TABLE customer ADD COLUMN debt_limit NUMERIC(10,2); ALTER TABLE customer ADD COLUMN creation_date TIMESTAMP WITH TIME ZONE; ALTER TABLE customer RENAME TO cust; COMMIT; Making Postgres Central in Your Data Center 28 / 38
  • 29. 3. Data Analytics ◮ Aggregates ◮ Optimizer ◮ Server-side languages, e.g. PL/R ◮ Window functions ◮ Bitmap heap scans ◮ Tablespaces ◮ Data partitioning ◮ Materialized views ◮ Common (recursive) table expressions ◮ Min/Max indexes (coming in 2015) Data warehouse-specific solutions are required for parallel operations across servers. http://www.slideshare.net/PGExperts/really-big-elephants-postgresql-dw-15833438 http://wiki.postgresql.org/images/3/38/PGDay2009-EN-Datawarehousing_with_PostgreSQL.pdf Making Postgres Central in Your Data Center 29 / 38
  • 30. Read-Only Slaves for Analytics Network Data WarehouseMaster Server /pg_xlog/pg_xlog Making Postgres Central in Your Data Center 30 / 38
  • 31. 4. Foreign Data Wrappers (Database Federation) Foreign data wrappers (SQL MED) allow queries to read and write data to foreign data sources. Foreign database support includes: ◮ CouchDB ◮ Informix ◮ MongoDB ◮ MySQL ◮ Neo4j ◮ Oracle ◮ Postgres ◮ Redis The transfer of joins, aggregates, and sorts to foreign servers is not yet implemented. http://www.postgresql.org/docs/current/static/ddl-foreign-data.html http://wiki.postgresql.org/wiki/Foreign_data_wrappers Making Postgres Central in Your Data Center 31 / 38
  • 32. Foreign Data Wrappers to Interfaces ◮ JDBC ◮ LDAP ◮ ODBC Making Postgres Central in Your Data Center 32 / 38
  • 33. Foreign Data Wrappers to Non-Traditional Data Sources ◮ Files ◮ HTTP ◮ AWS S3 ◮ Twitter Making Postgres Central in Your Data Center 33 / 38
  • 34. Foreign Data Wrapper Example CREATE SERVER postgres_fdw_test FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host ’localhost’, dbname ’fdw_test’); CREATE USER MAPPING FOR PUBLIC SERVER postgres_fdw_test OPTIONS (password ’’); CREATE FOREIGN TABLE other_world (greeting TEXT) SERVER postgres_fdw_test OPTIONS (table_name ’world’); det List of foreign tables Schema | Table | Server --------+-------------+------------------- public | other_world | postgres_fdw_test (1 row) Foreign Postgres server name in red; foreign table name in blue Making Postgres Central in Your Data Center 34 / 38
  • 35. Read and Read/Write Data Sources Postgres ora_tab tw_tab mon_tab MongoDB Twitter Oracle Making Postgres Central in Your Data Center 35 / 38
  • 36. 5. Postgres Centrality Postgres can rightly take a central place in the data center with its: ◮ Object-Relation flexibility and extensibility ◮ NoSQL-like workloads ◮ Powerful data analytics capabilities ◮ Access to foreign data sources No other database has all of these key components. Making Postgres Central in Your Data Center 36 / 38
  • 37. Postgres’s Central Role Extensions NoSQL Postgres Warehouse Data Foreign Data Wrappers Window Functions Data Paritioning Bitmap Scans Sharding Oracle Twitter MongoDB Easy DDL JSON PL/R ISN PostGIS Making Postgres Central in Your Data Center 37 / 38
  • 38. Conclusion http://momjian.us/presentations http://flickr.com/photos/vpickering/3617513255 Making Postgres Central in Your Data Center 38 / 38