PostgreSQL Extensibility




        Jeff Davis,
       Truviso, Inc.
Introduction
●   Many businesses have needs that are specific
    to the industry or to the business itself
    –   Not al...
And Also...




When implementing a workaround on a running
system, you want every possible option available.
Outline
●   How to solve domain-specific problems
    –   Solutions should be as seamless as a native
        feature, not...
Extensibility Overview
●   Functions – procedural code that can be written
    in a variety of languages
●   Operators – d...
Functions
●   Can be written in a variety of languages
    –   PL/pgSQL, C, Perl, Python, and more
●   Fundamental part of...
Operators
●   Allowable characters
    –   +-*/<>=~!@#%^&|`?
●   Can be infix, postfix, or prefix
●   Allow index searches
Types
●   Types help you represent data the way that
    makes sense to your business
●   User-defined types are given all...
Index Access Methods
●   BTree
●   GiST
●   GIN
●   You implement your specialized indexing
    algorithm using index supp...
Example: PostGIS
●   Spatial data – implements OpenGIS
    –   Collections of lines, points, polygons, etc.
●   Important ...
PostGIS: Types
●   Geometry type can hold
    –   points
    –   linestrings
    –   polygons
    –   etc.
●   A single ge...
PostGIS: operators
●   && - overlaps
●   << - strictly left of
●   >> - strictly right of
●   ~= - is the same as
●   ~ - ...
PostGIS: indexes
●   Uses GiST, a sophisticated generalized index
    access method with a tree structure.
●   GiST is muc...
Example: DBI-Link
●   Allows you to access any data source available
    via Perl's DBI
●   Remote data is seen as a table...
DBI-Link: PL/Perlu
●   PL/Perlu is the “untrusted” (i.e. not sandboxed)
    version of the perl procedural language.
●   C...
DBI-Link: SRFs
●   DBI-Link uses Set Returning Functions, or
    SRFs
●   Function returns multiple rows to PostgreSQL
●  ...
Example: PL/proxy
●   Access remote PostgreSQL data
●   Designed to be easy to retrieve data from many
    different Postg...
PL/proxy: example


CREATE OR REPLACE FUNCTION
    get_user_email(i_username text)
RETURNS SETOF text AS $$
  CLUSTER 'use...
Example: Full Text Search
●   Search for matches within a document.
●   The primary query is “contains”
●   Problem: BTree...
Full Text Search: Types
●   tsquery – type suitable for representing queries
    –   to_tsquery('bears & eat')
    –   'be...
Full Text Search: Indexes
●   Indexable Operator - @@
    –   Means “matches the query”
●   Index Access methods
    –   G...
Full Text Search: 8.3
●   Full Text Search lived as a fully functional
    separate module, including index support, since...
Development Landscape
●   Variety of projects outside of PostgreSQL core
    –   http://www.pgfoundry.org
●   Advantages
 ...
Conclusion
●   Extensibility allows people with knowledge in
    specific problem domains to create major
    features wit...
Upcoming SlideShare
Loading in …5
×

PostgreSQL, Extensible to the Nth Degree: Functions, Languages, Types, Rules, Performance, and More!

1,981 views
1,851 views

Published on

Jeff Davis

I'll be showing how the extensible pieces of PostgreSQL fit together to give you the full power of native functionality -- including performance. These pieces, when combined, make PostgreSQL able to do almost anything you can imagine. A variety add-ons have been very successful in PostgreSQL merely by using this extensibility. Examples in this talk will range from PostGIS (a GIS extension for PostgreSQL) to DBI-Link (manage any data source accessible via perl DBI).

0 Comments
1 Like
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
1,981
On SlideShare
0
From Embeds
0
Number of Embeds
2
Actions
Shares
0
Downloads
19
Comments
0
Likes
1
Embeds 0
No embeds

No notes for slide

PostgreSQL, Extensible to the Nth Degree: Functions, Languages, Types, Rules, Performance, and More!

  1. 1. PostgreSQL Extensibility Jeff Davis, Truviso, Inc.
  2. 2. Introduction ● Many businesses have needs that are specific to the industry or to the business itself – Not always met by generic tools offered by a DBMS ● People in that business know best how to solve those problems ● But those people may not be database engine developers
  3. 3. And Also... When implementing a workaround on a running system, you want every possible option available.
  4. 4. Outline ● How to solve domain-specific problems – Solutions should be as seamless as a native feature, not bolted on. ● How to make those solutions perform like a special-purpose solution ● How to build complete features without modifying the core database engine ● Where these features fit into the PostgreSQL development landscape
  5. 5. Extensibility Overview ● Functions – procedural code that can be written in a variety of languages ● Operators – define the behavior of your type and allow it to be usefully indexed ● Types – input/output ● Index Support – define sophisticated indexing behavior for your data type ● Procedural languages
  6. 6. Functions ● Can be written in a variety of languages – PL/pgSQL, C, Perl, Python, and more ● Fundamental part of extensibility architecture ● Used for – Operators – Type input/output functions – Index support routines – etc.
  7. 7. Operators ● Allowable characters – +-*/<>=~!@#%^&|`? ● Can be infix, postfix, or prefix ● Allow index searches
  8. 8. Types ● Types help you represent data the way that makes sense to your business ● User-defined types are given all the power of native types – Not slower – Not less flexible – Not less indexable ● Define input/output, storage, etc.
  9. 9. Index Access Methods ● BTree ● GiST ● GIN ● You implement your specialized indexing algorithm using index support functions ● You don't need to worry about low level details like page splits or the transaction log (WAL)
  10. 10. Example: PostGIS ● Spatial data – implements OpenGIS – Collections of lines, points, polygons, etc. ● Important queries – Contains – Overlaps ● Existing data types can't represent the data effectively ● BTree can't index effectively, no useful total order for spatial types
  11. 11. PostGIS: Types ● Geometry type can hold – points – linestrings – polygons – etc. ● A single geometry value can hold a combination of all of those
  12. 12. PostGIS: operators ● && - overlaps ● << - strictly left of ● >> - strictly right of ● ~= - is the same as ● ~ - contains ● @ - is contained by
  13. 13. PostGIS: indexes ● Uses GiST, a sophisticated generalized index access method with a tree structure. ● GiST is much more general than BTree. It can search for matches based on what makes sense for your problem, using algorithms that you supply via functions. ● Still has all the safety of the write-ahead log.
  14. 14. Example: DBI-Link ● Allows you to access any data source available via Perl's DBI ● Remote data is seen as a table in PostgreSQL ● How?
  15. 15. DBI-Link: PL/Perlu ● PL/Perlu is the “untrusted” (i.e. not sandboxed) version of the perl procedural language. ● Can execute arbitrary code, including creating sockets, etc. ● And therefore, it can use Perl's DBI to access any data over a network.
  16. 16. DBI-Link: SRFs ● DBI-Link uses Set Returning Functions, or SRFs ● Function returns multiple rows to PostgreSQL ● Is used in place of a table.
  17. 17. Example: PL/proxy ● Access remote PostgreSQL data ● Designed to be easy to retrieve data from many different PostgreSQL nodes ● Supports partitioning ● Implemented as its own language
  18. 18. PL/proxy: example CREATE OR REPLACE FUNCTION get_user_email(i_username text) RETURNS SETOF text AS $$ CLUSTER 'usercluster'; RUN ON hashtext(i_username) ; SELECT email FROM users WHERE username = i_username; $$ LANGUAGE plproxy;
  19. 19. Example: Full Text Search ● Search for matches within a document. ● The primary query is “contains” ● Problem: BTree unsuitable, documents do not have a useful total order. ● Problem: Text type is unsuitable – Lists the same word many times, doesn't eliminate stopwords – Doesn't equate words with same root
  20. 20. Full Text Search: Types ● tsquery – type suitable for representing queries – to_tsquery('bears & eat') – 'bear' & 'eat' ● tsvector – document representation suitable for indexed searches – to_tsvector('Bears eat everything.') – 'eat':2 'bear':1 'everyth':3
  21. 21. Full Text Search: Indexes ● Indexable Operator - @@ – Means “matches the query” ● Index Access methods – GIN – better query performance and scalability – GiST – better update performance ● GiST is such a generic index interface, it can be used for indexing spatial data and full text data
  22. 22. Full Text Search: 8.3 ● Full Text Search lived as a fully functional separate module, including index support, since 2002. ● Was very successful and useful to many people ● Included in PostgreSQL 8.3 ● The fact that such a major feature was fully functional completely outside of the core PostgreSQL product shows the power of the extensible architecture.
  23. 23. Development Landscape ● Variety of projects outside of PostgreSQL core – http://www.pgfoundry.org ● Advantages – Separate release cycle – Separate development – Users can pick and choose what fits them best among alternative solutions ● Disadvantages – Can't extend SQL language
  24. 24. Conclusion ● Extensibility allows people with knowledge in specific problem domains to create major features without dealing with database internals. ● More efficient development by making independent projects ● Enables businesses to solve practical problems quickly and cheaply, without extensive error- prone “glue” code. ● Doesn't require interrupting normal operation!

×