Development of 8.3 In India
Published in: Technology
Transcript

  • 1. 8.3: A Story of Many Patches. December 2007, FOSS.in. Josh Berkus, PostgreSQL Core Team
  • 2. PostgreSQL India?
  • 3. PostgreSQL 8.3 In Beta
  • 4. Many, Many Patches
      E.1. Release 8.3
      Release date: 2007-12-??
      Release date: CURRENT AS OF 2007-10-24
      E.1.1. Overview
      This release represents a major leap forward for PostgreSQL by adding significant new functionality and performance enhancements. This was made possible by a growing community that has dramatically accelerated the pace of development. This release adds the following major capabilities:
      – Full text search now fully integrated into the core database system
      – Support the SQL/XML standard, including new operators and an XML data type
      – Support for enumerated data types (ENUM)
      – Add Universally Unique Identifier (UUID) data type
      – Support arrays of composite types
      – Add control over whether NULLs sort first or last
      – Support updatable cursors
  • 5. PostgreSQL 8.3 Features
      ● Developer
        – SQL/XML
        – Integrated TSearch2
        – UUID, ENUM
        – PL/pgSQL debugging
      ● Admin
        – CSV Logging
        – Better Stats
      ● Consistency
        – HOT
        – Load Distributed Checkpoint
      ● Performance
        – Synchronized Scan
        – Asynch Commit
      ● Accessories
        – pgStandby
        – pgBouncer
        – pgSNMP
  • 6. Many Developers
      Tom Lane, USA; Teodor Sigaev, Russia; Steve Marshall
      Peter Eisentraut, Germany; Alvaro Herrera, Chile; Paul Bayer
      Bruce Momjian, USA; Mark Kirkwood, New Zealand; Doug Knight
      Dave Page, England; Joachim Wieland; Greg Sabino Mullane, USA
      Pavan Deolasee, India; Henry Hotz, USA; Chad Wagner
      Itagaki Takahiro, Japan; Magnus Hagander, Sweden; Brendan Jurd
      Greg Smith, USA; Tatsuo Ishii, Japan; Euler Taveira de Oliveira, Brazil
      David Fetter, USA; Victor Wagner; Joe Conway, USA
      Pavel Stehule, Czech; Bill Moran, USA; Simon Riggs, England
      Greg Stark, England; Andrew Dunstan, USA; Guillaume Smet, France
      Jan Wieck, USA; Arul Shaji, Australia; Hiroshi Saito, Japan
      Oleg Bartunov, Russia; Nikolay Samokhvalov, Russia; Chris Marcellino, Italy
      Florian Pflug; Neil Conway, Canada; Dave Cramer, Canada
      Jeff Davis, USA; Marc Fournier, Canada; Devrim Gunduz, Turkey
      Trevor Hardcastle; Jaime Casanova, Ecuador; Gavin Sherry, Australia
      Nikhil S, India; Albert Cervera; Jeremy Drake
      Holger Schurig; Bernd Helmle, Germany; Marko Kreen, Estonia
      D'Arcy Cain, Canada; Glen Parker; Kris Jurka, USA
      Gevik Babakhani, Netherlands; Heikki Linnakangas, Finland; Tom Dunstan, USA
  • 7. Many Developers (repeats the list on slide 6)
  • 8. PostgreSQL 8.3 Features (repeats the list on slide 5)
  • 9. Why contribute?
      ● PostgreSQL is a community project
        – owned by the community, run by the community
        – if you contribute, you are a full participant
          ● unlike some other databases
      ● Tinker with cool database stuff
        – we are hard-core database geeks
        – learn a lot from top database hackers
      ● Improve your employment prospects
        – database engineers are always in demand
  • 10. SQL/XML
      XMLROOT (
        XMLELEMENT (
          NAME gazonk,
          XMLATTRIBUTES (
            'val' AS name,
            1 + 1 AS num),
          XMLELEMENT (
            NAME qux, 'foo')
        ),
        VERSION '1.0',
        STANDALONE YES
      )
      Result:
      <?xml version='1.0' standalone='yes' ?>
      <gazonk name='val' num='2'>
        <qux>foo</qux>
      </gazonk>
      table_to_xml(tbl regclass, nulls boolean, tableforest boolean, targetns text)
      SELECT * FROM table1 WHERE (xpath('//person/name/text()', xdata))[1]::text = 'John Smith';
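To see what the slide's two examples compute, here is a rough client-side analogue using Python's standard `xml.etree.ElementTree`. It is only a sketch and does not touch PostgreSQL. The `rows` list is a hypothetical stand-in for the `xdata` column, and the `<people>` wrapper element is an assumption made so that the path `person/name` has something to descend into.

```python
import xml.etree.ElementTree as ET

# Build <gazonk name="val" num="2"><qux>foo</qux></gazonk>, mirroring the
# XMLELEMENT / XMLATTRIBUTES call on the slide.
gazonk = ET.Element("gazonk", {"name": "val", "num": str(1 + 1)})
qux = ET.SubElement(gazonk, "qux")
qux.text = "foo"
serialized = ET.tostring(gazonk, encoding="unicode")

# Hypothetical stand-in for the rows of table1.xdata.
rows = [
    "<people><person><name>John Smith</name></person></people>",
    "<people><person><name>Jane Doe</name></person></people>",
]


def first_person_name(xml_text):
    """Return the text of the first person/name element, or None."""
    root = ET.fromstring(xml_text)
    names = root.findall(".//person/name")
    return names[0].text if names else None


# Analogue of: WHERE (xpath('//person/name/text()', xdata))[1]::text = 'John Smith'
matches = [row for row in rows if first_person_name(row) == "John Smith"]
print(serialized)
print(len(matches))
```

In the database the filtering happens server-side through `xpath()`. The Python version only illustrates the shape of the result.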
  • 11. SQL/XML Prior Work 2002
      ● TorchBox contributes /contrib/xml2
        – Some XML functionality:
          ● XPath functions
          ● XSLT functions
        – BUT
          ● Non-standard, PostgreSQL-only syntax
          ● No real data type
          ● Many features missing
            – Charset support
            – DTD support
  • 12. SQL/XML Prior Work 2004
      ● Peter Eisentraut writes XML export
        – Export table to XML
        – BUT
          ● prototype only
          ● not useful without other XML functionality
          ● syntax requires changing PostgreSQL parser
  • 13. SQL/XML Prior Work 2005
      ● Pavel Stehule writes SQL/XML syntax demo
        – First standard syntax example
        – BUT
          ● depends on PL/Perl
          ● prototype only
          ● does not integrate with /contrib/xml2
  • 14. Nikolay Samokhvalov
      ● Graduate Student at University of Moscow
      ● Met major contributor Oleg Bartunov in 2005
        – ported MoiKrug.ru to PostgreSQL
      ● Masters Thesis: updatable XML views in RDBMS
  • 15. ● Google funds 700 students to work on Open Source
        – PostgreSQL gets 7
      ● Nikolay proposes project for SQL/XML
        – Proposal accepted
        – Peter will mentor
  • 16. SoC Proposal
      [SoC Proposal] Initial support of XMLType for PostgreSQL
      Summary
      Primary goal is introduction of special type support for storing XML data in ORDBMS PostgreSQL, querying this data and modifying it. This project is intended to develop manipulation abilities rather than special storage engine (VARCHAR as initial storage implicit type).
      At the moment there is no good general vision of most suitable storage for XMLType. Moreover, from my point of view, DBMS should have support of different index types for XMLType - every for its special purpose. And which is more important is an open question. That's why I propose to work on external things rather than internals (data structure for index) and strictly follow standards. But anyway, I've included path index (#7 in the list of Deliverables), because now I suppose that it is most expectative type of index (this item is optional, because here community's feedback is highly needed).
      Deliverables
      1. Ability to define any column as of XMLType. Initially, this means that only well-formed XML documents could be stored in such a column.
      2. Automatic validation of XML documents being inserted/modified against XML schema, if definition of column contains it (reference to it). DTD and/or XML Schema could be used for this.
      3. Subset of SQL/XML standard [1] for mixing relational and XML data in queries. This includes at least following: XMLELEMENT, XMLAGG, XMLFOREST, XMLCONCAT expressions; implementation of mapping rules for basic types. (See Project Details for more details).
      4. XML domains support: possibility to define domain based on XMLType, using XML schema (DTD / XML Schema).
      5. Basic XPath support (existing experience - contrib/xml2 module - should be taken into account).
      6. Basic XSLT support (existing experience - contrib/xml2 module - should be taken into account).
      7. Path indexes for fast retrieval of XML documents (queries with XPath expressions in WHERE clause). [OPTIONAL]
      8. Documentation (definition rules for XMLType, SQL/XML expressions, etc).
  • 17. Mentor: Peter Eisentraut
      ● From Aachen, Germany
      ● Core Team member since 2004
      ● In charge of PostgreSQL Documentation
      ● Prior XML work
  • 18. Specification Research
      ● ANSI SQL 2006 -- SQL/XML
      XML Publishing Functions
      xmlelement()     Creates an XML element, allowing the name to be specified.
      xmlattributes()  Creates XML attributes from columns, using the name of each column as the name of the corresponding attribute.
      xmlroot()        Creates the root node of an XML document.
      xmlcomment()     Creates an XML comment.
      xmlpi()          Creates an XML processing instruction.
      xmlparse()       Parses a string as XML and returns the resulting XML structure.
      xmlforest()      Creates XML elements from columns, using the name of each column as the name of the corresponding element.
      xmlconcat()      Combines a list of individual XML values to create a single value containing an XML forest.
      xmlagg()         Combines a collection of rows, each containing a single XML value, to create a single value containing an XML forest.
  • 19. Specification Research● ANSI SQL 2006 -- SQL/XML
  • 20. Specification Proposal
      Re: Google SoC--Idea Request
      From: "Nikolay Samokhvalov" <samokhvalov ( at ) gmail ( dot ) com>
      Subject: Re: Google SoC--Idea Request
      Date: Tue, 2 May 2006 12:34:43 +0400
      Proposal: XMLType for PostgreSQL.
      *** Minimum: *** to have special type support for storing XML data and working with it. This means following:
      - ability to define any column of a table as of XMLType; internally, all data is stored as VARCHAR;
      - auto validation of documents against XML schema, if it was specified in column definition or in XML data sheets themselves (DTD, XSD or at least one of them) /*contrib/xml2 has such feature, but it uses libxml, what means DOM interface. Maybe it's better to use some SAX parser to solve this task*/;
      - XPath indexes for queries with path expressions in WHERE clause /*I suppose this kind of indexes would be most frequently used. I propose using good labeling schema and GIST and/or Gin here*/;
      - some subset of SQL/XML. Actually, part 14 of SQL:200n (SQL/XML) has more than 400 pages now and contains some established constructions, that are using in other DBMSes. There is the some patch already written by Pavel Stehule: http://www.pgsql.ru/db/mw/msg.html?mid=2096818. (BTW, what is with it? it was kept for 8.2, so what is the result?) I've tested it several months ago, basic SQL/XML functions worked fine. It changes grammar,
  • 21. Specification Revisions
      XML export function signatures
      From: Peter Eisentraut <peter_e ( at ) gmx ( dot ) net>
      To: pgsql-hackers ( at ) postgresql ( dot ) org
      Subject: XML export function signatures
      Date: Mon, 12 Feb 2007 20:18:59 +0100
      Here are the proposed signatures for the XML export functions. While I have seen the output formats in use elsewhere, I could not find any useful information on how to invoke these mappings, so the following is purely my own invention.
      table_to_xml(tbl regclass, nulls boolean, tableforest boolean, targetns text) RETURNS xml
      query_to_xml(query text, nulls boolean, tableforest boolean, targetns text) RETURNS xml
      table_to_xmlschema(tbl regclass, nulls boolean, tableforest boolean, targetns text) RETURNS xml
      query_to_xmlschema(query text, nulls boolean, tableforest boolean, targetns text) RETURNS xml
      table_to_xml_and_xmlschema(tbl regclass, nulls boolean, tableforest boolean, targetns text) RETURNS xml
      query_to_xml_and_xmlschema(query text, nulls boolean, tableforest boolean, targetns text) RETURNS xml
      cursor_get_xml(cursor refcursor, count int, nulls boolean, tableforest boolean, targetns text) RETURNS xml
      cursor_to_xmlschema(cursor refcursor, nulls boolean, tableforest boolean, targetns text) RETURNS xml
  • 22. Specification Revisions
      Re: XML export function signatures
      From: Peter Eisentraut <peter_e ( at ) gmx ( dot ) net>
      To: Andrew Dunstan <andrew ( at ) dunslane ( dot ) net>
      Subject: Re: XML export function signatures
      Date: Mon, 12 Feb 2007 23:57:49 +0100
      Andrew Dunstan wrote:
      > . table_to_xml_and_xmlschema seems like a mouthful - can we shorten
      > it a bit?
      Well, it gives you back a mouthful of data, too. :)
      > . what are the two ways of representing data that tableforest
      > distinguishes?
      tableforest = false gives you something like
      <tablename>
        <row> <!-- where "row" is constant -->
          <col1name>data</col1name>
          <col2name>data</col2name>
        </row>
        <row>
  • 23. Approved Specification
      ● XML Data Type
        – type-safe XML storage
        – supports XML operators, functions
      ● XML Functions
        – Generation
        – Manipulation (XSLT)
        – Export
        – XPath Query
      ● XML Expressions
        – IS DOCUMENT, etc.
  • 24. Code Modifications
  • 25. Initial Versions of Patch
      Updated XML patch
      From: Peter Eisentraut <peter_e ( at ) gmx ( dot ) net>
      To: pgsql-patches ( at ) postgresql ( dot ) org
      Subject: Updated XML patch
      Date: Thu, 14 Dec 2006 23:02:05 +0100
      Attached is an updated patch for XML functionality, which subsumes all earlier patches on the subject. This includes a data type with format checking, and functions to mangle values. For the moment, I have cut out the inessential stuff such as xpath. The included regression test file xml.sql shows some of the things that work.
      This patch already covers most of the parser work. What is left hereafter is adjusting all the corner cases, the escaping rules, and the various XML parser behaviors.
      Use configure --with-libxml to build.
      --
      Peter Eisentraut
      http://developer.postgresql.org/~petere/
      Attachment: current-xml-patch.bz2
      Description: BZip2 compressed data
  • 26. Initial Patch
       static void
      +_outXmlExpr(StringInfo str, XmlExpr *node)
      +{
      +    WRITE_NODE_TYPE("XMLEXPR");
      +
      +    WRITE_ENUM_FIELD(op, XmlExprOp);
      +    WRITE_STRING_FIELD(name);
      +    WRITE_NODE_FIELD(named_args);
      +    WRITE_NODE_FIELD(args);
      +}
      +
      +static void
       _outCoerceToDomain(StringInfo str, CoerceToDomain *node)
       {
           WRITE_NODE_TYPE("COERCETODOMAIN");
      @@ -2019,6 +2030,9 @@
                   case T_BooleanTest:
                       _outBooleanTest(str, obj);
                       break;
      +            case T_XmlExpr:
      +                _outXmlExpr(str, obj);
      +                break;
                   case T_CoerceToDomain:
                       _outCoerceToDomain(str, obj);
                       break;
      diff -Nru -x configure ../cvs-pgsql/src/backend/nodes/readfuncs.c ./src/backend/nodes/readfuncs.c
      --- ../cvs-pgsql/src/backend/nodes/readfuncs.c 2006-12-12 16:31:46.000000000 +0100
      +++ ./src/backend/nodes/readfuncs.c 2006-12-14 21:20:08.000000000 +0100
      @@ -765,6 +765,22 @@
       }
  • 27. Patch Revisions
      Re: xml type and encodings
      From: "Andrew Dunstan" <andrew ( at ) dunslane ( dot ) net>
      To: "Peter Eisentraut" <peter_e ( at ) gmx ( dot ) net>
      Subject: Re: xml type and encodings
      Date: Mon, 15 Jan 2007 16:35:13 -0600 (CST)
      Peter Eisentraut wrote:
      > Florian G. Pflug wrote:
      >> Couldn't the server change the encoding declaration inside the xml to
      >> the correct
      >> one (the same as client_encoding) before returning the result?
      >
      > The data type output function doesn't know what the client encoding is
      > or whether the data will be shipped to the client at all. But what I'm
      > thinking is that we should remove the encoding declaration if possible.
      > At least that would be less confusing, albeit still potentially
      > incorrect if the client continues to process the document without care.
      The XML Spec says:
      "In the absence of information provided by an external transport protocol (e.g. HTTP or MIME), it is a fatal error for an entity including an encoding declaration to be presented to the XML processor in an encoding other than that named in the declaration, or for an entity which begins with neither a Byte Order Mark nor an encoding declaration to use an encoding other than UTF-8. Note that since ASCII is a subset of UTF-8, ordinary ASCII entities do not strictly need an encoding declaration."
  • 28. More Patch Revisions
      xpath_array with namespaces support
      From: "Nikolay Samokhvalov" <samokhvalov ( at ) gmail ( dot ) com>
      To: PGSQL-Patches <pgsql-patches ( at ) postgresql ( dot ) org>
      Subject: xpath_array with namespaces support
      Date: Wed, 21 Feb 2007 02:46:33 +0300
      As a result of discussion with Peter, I provide modified patch for xpath_array() with namespaces support.
      The signature is:
        _xml xpath_array(text xpathQuery, xml xmlValue[, _text namespacesBindings])
      The third argument is 2-dimensional array defining bindings for namespaces. Simple examples:
      xmltest=# SELECT xpath_array('//text()', '<local:data xmlns:local="http://127.0.0.1"><local:piece id="1">number one</local:piece><local:piece id="2" /></local:data>');
        xpath_array
      ----------------
       {"number one"}
      (1 row)
  • 29. Yet More Revisions
      correct format for date, time, timestamp for XML functionality
      From: "Pavel Stehule" <pavel ( dot ) stehule ( at ) hotmail ( dot ) com>
      To: pgsql-patches ( at ) postgresql ( dot ) org
      Subject: correct format for date, time, timestamp for XML functionality
      Date: Tue, 20 Feb 2007 13:27:42 +0100
      Hello,
      this patch ensures independency datetime fields on current datestyle setting. Add new internal datestyle USE_XSD_DATESTYLE. It's almost same to USE_ISO_DATESTYLE. Differences are for timestamp:
      ISO: yyyy-mm-dd hh24:mi:ss
      XSD: yyyy-mm-ddThh24:mi:ss
      I found one link about this topic:
      http://forums.oracle.com/forums/thread.jspa?threadID=467278&tstart=0
      Regards
      Pavel Stehule
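The only difference the email names between the two timestamp styles is the separator between date and time: a space for ISO, a literal `T` for XSD. A minimal Python sketch of that difference follows. It does not use any PostgreSQL code, and the sample timestamp is simply the email's own send time.

```python
from datetime import datetime

# Sample value: the send time of the email above, without the time zone.
ts = datetime(2007, 2, 20, 13, 27, 42)

# ISO datestyle as described in the email: date and time separated by a space.
iso_style = ts.isoformat(sep=" ")

# XSD datestyle as described in the email: date and time separated by "T".
xsd_style = ts.isoformat(sep="T")

print(iso_style)
print(xsd_style)
```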
  • 30. Patch Accepted
      From: Bruce Momjian <bruce ( at ) momjian ( dot ) us>
      To: "Nikolay Samokhvalov" <samokhvalov ( at ) gmail ( dot ) com>
      Subject: Re: [HACKERS] xml2 contrib patch supporting default XML namespaces
      Date: Thu, 22 Mar 2007 16:16:16 -0400 (EDT)
      Your patch has been added to the PostgreSQL unapplied patches list at:
      http://momjian.postgresql.org/cgi-bin/pgpatches
      It will be applied as soon as one of the PostgreSQL committers reviews and approves it.
  • 31. Write Documentation
      9.14. XML Functions
      The functions and function-like expressions described in this section operate on values of type xml. Check Section 8.13 for information about the xml type. The function-like expressions xmlparse and xmlserialize for converting to and from type xml are not repeated here. Use of many of these functions requires the installation to have been built with configure --with-libxml.
      9.14.1. Producing XML Content
      A set of functions and function-like expressions are available for producing XML content from SQL data. As such, they are particularly suitable for formatting query results into XML documents for processing in client applications.
      9.14.1.1. xmlcomment
      xmlcomment(text)
      The function xmlcomment creates an XML value containing an XML comment with the specified text as content. The text cannot contain -- or end with a - so that the resulting construct is a valid XML comment. If the argument is null, the result is null.
      Example:
      SELECT xmlcomment('hello');
        xmlcomment
      --------------
       <!--hello-->
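The three rules in the documentation excerpt (no `--` inside the text, no trailing `-`, null in gives null out) are easy to state as code. The function below is a hypothetical Python analogue written for illustration only. It is not the server implementation, and raising `ValueError` is an assumption about how to signal an invalid argument.

```python
def xmlcomment(text):
    """Hypothetical analogue of the documented xmlcomment() rules."""
    # Null input gives a null result.
    if text is None:
        return None
    # The text may not contain "--" or end with "-", otherwise the
    # result would not be a valid XML comment.
    if "--" in text or text.endswith("-"):
        raise ValueError("text is not valid XML comment content")
    return "<!--" + text + "-->"


print(xmlcomment("hello"))
```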
  • 32. XML in 8.3 Beta
      E.1. Release 8.3
      Release date: 2007-12-??
      Release date: CURRENT AS OF 2007-10-24
      E.1.1. Overview
      This release represents a major leap forward for PostgreSQL by adding significant new functionality and performance enhancements. This was made possible by a growing community that has dramatically accelerated the pace of development. This release adds the following major capabilities:
      – Full text search now fully integrated into the core database system
      – Support the SQL/XML standard, including new operators and an XML data type
      – Support for enumerated data types (ENUM)
      – Add Universally Unique Identifier (UUID) data type
  • 33. SQL/XML Feature Set
      ● XML Parsing
      ● XML Functions
      ● XML Export
      ● XPath B-tree Index
  • 34. Future XML Projects
      ● Use of HSTORE for advanced XML indexing
      ● Automated XML decomposition
        – XML-to-Table
        – XML-to-Schema
      ● PL/XSLT
        – XHTML query
      ● XQuery support
  • 35. HOT
  • 36. Fastest OSDB [Charts: J2EE Throughput and Acquisition Cost Comparison (cost in US dollars), each comparing MySQL, PostgreSQL and Proprietary]
  • 37. Most Scalable
  • 38. The Consistency Problem
  • 39. VACUUM
  • 40. What's MVCC?
      ● Multi-Version Concurrency Control
        – Each user gets their own “version” of the data
        – Allows parallelization of updates/reads
        – Without it, scalability is not possible
          ● You have to lock everything
          ● Or violate ACID transactions
  • 41. MVCC [Diagram label: Row Version 1]
  • 42. MVCC [Diagram labels: Row Version 1, Row Version 1, Row Version 2, Row Version 3; SELECT ..., SELECT ..., BEGIN UPDATE, BEGIN UPDATE]
  • 43. MVCC [Diagram labels: Row Version 1, Row Version 1, Row Version 2, Row Version 2, Row Version 3; ROLLBACK; SELECT ..., SELECT ..., SELECT ..., BEGIN UPDATE COMMIT, BEGIN UPDATE COMMIT]
  • 44. MVCC: Them & Us. The Overwriting Model (InnoDB & Oracle) [Diagram: Base Relation and Rollback Segment; on UPDATE the old row version is copied to the rollback segment and the row is overwritten]
  • 45. MVCC: Them & Us. The Overwriting Model (InnoDB & Oracle)
      ● Advantages
        – Low table/index maintenance requirements
        – Latest row version fast access
      ● Disadvantages
        – Transaction isolation can break
        – Long-running transactions expensive
        – Rollbacks very expensive
        – Rollback segment bottleneck
  • 46. MVCC: Them & Us. The Non-overwriting Model (PostgreSQL & Firebird) [Diagram: Base Relation; on UPDATE the old row version is kept and copied into a new row version]
  • 47. MVCC: Them & Us. The Non-overwriting Model (PostgreSQL & Firebird)
      ● Advantages
        – Transaction isolation effortless
        – Rollbacks free
        – Long-running transactions not a problem
      ● Disadvantages
        – High table/index maintenance
        – “Frequently updated table” problem
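The non-overwriting model can be illustrated with a toy Python sketch: an UPDATE appends a new row version instead of overwriting, and readers filter versions by which transactions they consider committed. Everything here (`RowVersion`, `visible`, `update`, the integer transaction ids) is a hypothetical simplification for illustration. PostgreSQL's real visibility rules are considerably more involved.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class RowVersion:
    value: str
    xmin: int                    # transaction that created this version
    xmax: Optional[int] = None   # transaction that replaced it, if any


def visible(version: RowVersion, committed: Set[int]) -> bool:
    """A version is visible if its creator committed and its replacer did not."""
    created = version.xmin in committed
    replaced = version.xmax is not None and version.xmax in committed
    return created and not replaced


def update(versions: List[RowVersion], new_value: str, xid: int) -> None:
    """Non-overwriting update: mark the old version, append a new one."""
    versions[-1].xmax = xid
    versions.append(RowVersion(new_value, xmin=xid))


versions = [RowVersion("v1", xmin=1)]
update(versions, "v2", xid=2)  # transaction 2 is still in progress

# A reader that only sees transaction 1 as committed still reads v1.
reader_sees = [v.value for v in versions if visible(v, {1})]

# Once transaction 2 commits, new readers see v2 instead.
after_commit = [v.value for v in versions if visible(v, {1, 2})]

print(reader_sees, after_commit)
```

In this sketch a rollback needs no work at all: if transaction 2 never enters the committed set, `v2` is simply never visible. That matches the slide's "rollbacks free" point, and the leftover dead versions are the maintenance cost the slide lists as a disadvantage.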
  • 48. Frequently Updated Tables [Diagram: Tuplestore holding Row C: Versions 1 to 5; small updates create Versions 2, 3 and 4 and a large update creates Version 5, with the indexes updated each time]
  • 49. Frequently Updated Tables (repeats the diagram on slide 48)
  • 50. Frequently Updated Tables Tuplestore Row C: Version 5
  • 51. Frequently Updated Tables Tuplestore Row C: Version 5
  • 52. Frequently Updated Tables Tuplestore Row C: Version 5
  • 53. Poor Performance
  • 54. Pavan Deolasee
      ● Graduated IIT Bombay
        – focus on databases
      ● Worked for VERITAS
      ● Lead Engineer at EnterpriseDB
        – PostgreSQL vendor
        – Contributes performance patches to community
      ● Lives in Pune
  • 55. Team Effort
      ● Simon Riggs
        – original proposal
        – prototypes
        – specification
      ● Heikki Linnakangas, Tom Lane and others
        – revisions
        – code review
        – bug fixes
  • 56. Meeting at EnterpriseDB
  • 57. Initial proposal
      ● Update-in-Place with HOT file
      [Diagram: Base Relation holds Row C: Version 1; HOT File shown alongside]
  • 58. Initial proposal
      ● Update-in-Place with HOT file
      [Diagram: on UPDATE, Base Relation holds Row C: Version 2; the old version, Row C: Version 1, is copied to the HOT File]
  • 59. Initial proposal
      ● Update-in-Place with HOT file
      [Diagram: on UPDATE, Base Relation holds Row C: Version 3; the old version is copied to the HOT File, which holds Row C: Version 1 and Version 2 linked by a tuple chain]
  • 60. Initial proposal
      ● Update-in-Place with HOT file
      [Diagram: on UPDATE, Base Relation holds Row C: Version 4; the old version is copied to the HOT File, which holds Row C: Versions 1, 2 and 3 linked by tuple chains]
  • 61. Initial proposal
      ● Update-in-Place with HOT file
      [Diagram: Base Relation holds Row C: Version 4]
  • 62. First proposal to pgsql-hackers
      Frequent Update Project: Design Overview of HOT Updates
      From: "Simon Riggs" <simon ( at ) 2ndquadrant ( dot ) com>
      To: <pgsql-hackers ( at ) postgresql ( dot ) org>
      Subject: Frequent Update Project: Design Overview of HOT Updates
      Date: Thu, 09 Nov 2006 17:13:16 +0000
      Design Overview of HOT Updates
      ------------------------------
      The objective is to increase the speed of the UPDATE case, while minimizing the overall negative effects of the UPDATE. We refer to the general requirement as *Frequent Update Optimization*, though this design proposal is for Heap Overflow Tuple (HOT) Updates. It is similar in some ways to the design for SITC already proposed, though has a number of additional features drawn from other designs to make it a practical and effective implementation.
      EnterpriseDB have a working, performant prototype of this design. There are still a number of issues to resolve and the intention is to follow an open community process to find the best way forward. All required detail will be provided for the work conducted so far.
      Current PGSQL behaviour is for UPDATEs to create a new tuple version within the heap, so acts from many perspectives as if it were an INSERT. All of the tuple versions are chained together, so that whichever of the tuples is visible to your Snapshot, you can walk the chain to find the most recent tuple version to update.
  • 63. Revisions: Reverse Order [Diagram: Normal Tuples and HOT Relation File; Row C: Version 1 is followed by in-page UPDATEs producing Version 2, Version 3 and Version 4]
  • 64. Revisions: Chains, not files [Diagram: Normal Tuples and HOT Tuple Chain; Row C: Version 1 is followed by in-page UPDATEs producing Version 2, Version 3 and Version 4]
  • 65. Add microvacuum [Diagram: Normal Tuples and HOT Tuple Chain; Row C: Version 1 is followed by in-page UPDATEs producing Version 2, Version 3 and Version 4, with microvacuum; Version 5 goes to a new page with an index update (Indexes Updated)]
  • 66. Submit patch draft v.1
      HOT WIP Patch - version 1
      From: "Pavan Deolasee" <pavan ( dot ) deolasee ( at ) gmail ( dot ) com>
      To: PostgreSQL-development <pgsql-hackers ( at ) postgresql ( dot ) org>
      Subject: HOT WIP Patch - version 1
      Date: Wed, 14 Feb 2007 15:34:46 +0530
      This is a WIP patch based on the recent posting by Simon and discussions thereafter. We are trying to do one piece at a time and intention is to post the work ASAP so that we could get early and continuous feedback from the community. We could then incorporate those suggestions in the next WIP patch.
      To start with, this patch implements HOT-update for a simple case when there is enough free space in the same block so that it can accommodate the new version of the tuple. A necessary condition for doing HOT-update is that none of the index columns is changed. The old version is marked as HEAP_UPDATE_ROOT and the new version is marked as HEAP_ONLY_TUPLE. If a tuple is HOT-updated, no new index entry is added.
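The email gives two conditions for the simple HOT-update case: no indexed column changes, and the block has enough free space for the new tuple version. A minimal Python sketch of that check follows. The function name and its parameters are hypothetical and chosen for illustration. They do not correspond to identifiers in the patch.

```python
def can_hot_update(changed_columns, indexed_columns, free_bytes, new_tuple_bytes):
    """Return True when the simple HOT-update case from the email applies."""
    # Necessary condition: none of the index columns is changed.
    no_index_column_changed = not (set(changed_columns) & set(indexed_columns))
    # The new tuple version must fit in the free space of the same block.
    fits_in_block = new_tuple_bytes <= free_bytes
    return no_index_column_changed and fits_in_block


# Updating a non-indexed column with room to spare qualifies.
print(can_hot_update({"balance"}, {"id"}, free_bytes=200, new_tuple_bytes=64))
# Changing an indexed column does not.
print(can_hot_update({"id"}, {"id"}, free_bytes=200, new_tuple_bytes=64))
```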
  • 67. Feature Freeze
  • 68. Submit another version
      HOT WIP Patch - version 2
      From: "Pavan Deolasee" <pavan ( dot ) deolasee ( at ) gmail ( dot ) com>
      To: PostgreSQL-development <pgsql-hackers ( at ) postgresql ( dot ) org>, pgsql-patches ( at ) postgresql ( dot ) org
      Subject: HOT WIP Patch - version 2
      Date: Tue, 20 Feb 2007 12:08:14 +0530
      Reposting - looks like the message did not get through in the first attempt. My apologies if multiple copies are received.
      This is the next version of the HOT WIP patch. Since the last patch that I sent out, I have implemented the HOT-update chain pruning mechanism.
      When following a HOT-update chain from the index fetch, if we notice that the root tuple is dead and it is HOT-updated, we try to prune the chain to the smallest possible length. To do that, the share lock is upgraded to an exclusive lock and the tuple chain is followed till we find a live/recently-dead tuple. At that point, the root t_ctid is made point to that tuple. In order to preserve the xmax/xmin chain, the xmax of the root tuple is also updated to xmin of the found tuple. Since this xmax is also < RecentGlobalXmin
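The pruning step in the email can be sketched in a few lines of Python: starting from a dead, HOT-updated root, walk the chain until a tuple that is not dead is found, then point the root straight at it. This is a toy model under stated assumptions. `HeapTuple`, `next_offset` (standing in for `t_ctid` within one page) and `prune_chain` are hypothetical names, `dead` means dead to every transaction, and the locking and xmax/xmin adjustment described in the email are left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class HeapTuple:
    dead: bool                         # dead to all transactions
    next_offset: Optional[int] = None  # stand-in for t_ctid within the page


def prune_chain(page: Dict[int, HeapTuple], root: int) -> List[int]:
    """Shorten the chain at `root`; return the offsets that were skipped."""
    root_tuple = page[root]
    # Only a dead root that has been updated is a pruning candidate.
    if not root_tuple.dead or root_tuple.next_offset is None:
        return []
    skipped = []
    target = root_tuple.next_offset
    # Follow the chain until a live or recently-dead tuple is found.
    while page[target].dead and page[target].next_offset is not None:
        skipped.append(target)
        target = page[target].next_offset
    root_tuple.next_offset = target
    return skipped


page = {
    1: HeapTuple(dead=True, next_offset=2),
    2: HeapTuple(dead=True, next_offset=3),
    3: HeapTuple(dead=True, next_offset=4),
    4: HeapTuple(dead=False),
}
skipped = prune_chain(page, root=1)
print(skipped, page[1].next_offset)
```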
  • 69. Submit another version
      HOT WIP Patch - version 3.2
      From: "Pavan Deolasee" <pavan ( dot ) deolasee ( at ) gmail ( dot ) com>
      To: PostgreSQL-development <pgsql-hackers ( at ) postgresql ( dot ) org>, pgsql-patches ( at ) postgresql ( dot ) org
      Subject: HOT WIP Patch - version 3.2
      Date: Sun, 25 Feb 2007 00:06:04 +0530
      Please see the attached WIP HOT patch - version 3.2. It now implements the logic for reusing heap-only dead tuples. When a HOT-update chain is pruned, the heap-only tuples are marked LP_DELETE. The lp_offset and lp_len fields in the line pointer are maintained.
      When a backend runs out of free space in a page when doing an UPDATE, it searches the line pointers to find a slot which is marked LP_DELETEd and has enough space to accommodate the new tuple. If such a slot is found, it's reused. We might waste some space if the slot is larger than the tuple, but that gets reclaimed at VACUUM time.
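The slot search described in the email amounts to a first-fit scan over the page's line pointers. A minimal Python sketch follows. `LinePointer` and `find_reusable_slot` are hypothetical names, and the `deleted` flag stands in for the LP_DELETE marking. The real patch works on the page's line pointer array in C.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LinePointer:
    offset: int    # stand-in for lp_offset
    length: int    # stand-in for lp_len
    deleted: bool  # stand-in for the LP_DELETE marking


def find_reusable_slot(line_pointers: List[LinePointer],
                       needed_bytes: int) -> Optional[int]:
    """Return the index of the first deleted slot large enough, else None."""
    for index, pointer in enumerate(line_pointers):
        if pointer.deleted and pointer.length >= needed_bytes:
            return index
    return None


pointers = [
    LinePointer(offset=0, length=40, deleted=False),
    LinePointer(offset=40, length=24, deleted=True),
    LinePointer(offset=64, length=64, deleted=True),
]
print(find_reusable_slot(pointers, needed_bytes=48))
```

A first-fit scan may pick a slot larger than the tuple. That is the wasted space the email says is reclaimed at VACUUM time.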
  • 70. Yet another version
      HOT WIP Patch - version 6.3
      From: "Pavan Deolasee" <pavan ( dot ) deolasee ( at ) gmail ( dot ) com>
      To: PostgreSQL-development <pgsql-hackers ( at ) postgresql ( dot ) org>
      Subject: HOT WIP Patch - version 6.3
      Date: Mon, 2 Apr 2007 17:51:13 +0530
      Please see the HOT version 6.3 patch posted on pgsql-patches.
      I've implemented support for CREATE INDEX and CREATE INDEX CONCURRENTLY based on the recent discussions. The implementation is not yet complete and needs some more testing/work/discussion before we can start considering it for review.
      One of the regression test case fails because CIC now works in three phases. In the first phase, we just create the catalog entry for the index and commit the transaction. If the index_build fails because of any error (say, unique key constraint) the index creation fails, but the catalog entry remains.
  • 71. Many issues resolved
      ● CREATE INDEX
        – including CONCURRENTLY
      ● Re-using dead tuples
      ● Interaction with Cluster
      ● Plan invalidation
      ● Utilities & tools
  • 72. But still not reviewed
  • 73. Tom Lane says: “break it up, please!”
      ● Too big a patch for reviewers
        – almost 12,000 lines
      ● Broken up into 5 parts
        – 1. The basic HOT implementation
        – 2. Retain vacuum, chain pruning and other tricks
        – 3. Fix the broken VACUUM and VACUUM FULL code
        – 4. Fix the broken CREATE INDEX
        – 5. pg_stats and other misc. utilities
  • 74. Code Reviewed
  • 75. PostgreSQL Beta
  • 76. HOT Performance
  • 77. SKYLINE OF: SKYLINE OF [DISTINCT] d1 [MIN | MAX | DIFF], .., dm [MIN | MAX | DIFF] SELECT * FROM books SKYLINE OF rating MAX, price MIN;
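To illustrate what the example query on slide 77 would return, here is a minimal Python sketch. It assumes the usual skyline-operator semantics (a row is returned unless some other row is at least as good in every listed dimension and strictly better in at least one); the slides themselves give only the syntax. The function names and the sample `books` rows are hypothetical, and the `DISTINCT` and `DIFF` options are not modelled.

```python
from typing import Dict, List

Row = Dict[str, float]


def dominates(a: Row, b: Row) -> bool:
    """True if `a` beats `b` for `rating MAX, price MIN`."""
    at_least_as_good = a["rating"] >= b["rating"] and a["price"] <= b["price"]
    strictly_better = a["rating"] > b["rating"] or a["price"] < b["price"]
    return at_least_as_good and strictly_better


def skyline(rows: List[Row]) -> List[Row]:
    """Keep every row that no other row dominates (naive O(n^2) comparison)."""
    return [r for r in rows if not any(dominates(other, r) for other in rows)]


# Hypothetical sample data standing in for the `books` table.
books = [
    {"id": 1, "rating": 5, "price": 30},
    {"id": 2, "rating": 4, "price": 10},
    {"id": 3, "rating": 3, "price": 20},  # dominated by book 2
    {"id": 4, "rating": 5, "price": 40},  # dominated by book 1
]

result_ids = [row["id"] for row in skyline(books)]
```

With this sample data `result_ids` is `[1, 2]`: neither the best-rated book nor the cheapest book is beaten on both dimensions at once.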
  • 78. CDE @ IIIT, Hyderabad
  • 79. Feature proposed 3/3
  • 80. Extension to SQL syntax SKYLINE OF [DISTINCT] d1 [MIN | MAX | DIFF],  .., dm [MIN | MAX | DIFF]
  • 81. Approximate Queries
  • 82. Approximate Queries: SELECT * FROM Books SKYLINE OF rating MAX, price MIN;
  • 83. Lots of discussion
  • 84. Problems with the Patch Re: PostgreSQL - SKYLINE OF clause added! From: Tom Lane <tgl ( at ) sss ( dot ) pgh ( dot ) pa ( dot ) us> To: Shane Ambler <pgsql ( at ) Sheeky ( dot ) Biz> Subject: Re: PostgreSQL - SKYLINE OF clause added! Date: Thu, 08 Mar 2007 01:12:22 -0500 Shane Ambler <pgsql ( at ) Sheeky ( dot ) Biz> writes: > Tom Lane wrote: >> Well, whether it's horrible or not is in the eye of the beholder, but >> this is certainly a non-standard syntax extension. > Being non-standard should not be the only reason to reject a worthwhile > feature. No, but being non-standard is certainly an indicator that the feature may not be of widespread interest --- if it were, the SQL committee would've gotten around to including it; seems they've managed to include everything but the kitchen sink already. Add to that the complete lack of any previous demand for the feature, and you have to wonder where the market is. > The fact that several > different groups have been mentioned to be working on this feature would > indicate that it is worth considering.
  • 85. Problems with the Patch ● Not part of the ANSI SQL standard – possibly low general applicability – might get added to standard with different syntax – might never get standardized at all ● Requires changes to PostgreSQL parser – new keyword breaks applications – possible side effects ● Not coded to PostgreSQL standards – would need refactoring
  • 86. Rejected! Re: PostgreSQL - SKYLINE OF clause rejected From: Tom Lane <tgl ( at ) sss ( dot ) pgh ( dot ) pa ( dot ) us> To: Shane Ambler <pgsql ( at ) Sheeky ( dot ) Biz> Subject: Re: PostgreSQL - SKYLINE OF clause added! Date: Sun, 11 Mar 2007 23:44:41 -0400 Shane Ambler <pgsql ( at ) Sheeky ( dot ) Biz> writes: > If we consider this thoroughly and compile a suitable syntax that covers > all bases it could be used as the basis of the standard definition or be > close to what ends up in the standard. I'll bet you a very good dinner that the word SKYLINE will never be seen in the standard. To me, the proposed feature seems an extremely narrow, special-purpose thing. The SQL committee have never been into that very much, and seem even less interested in the last couple of revisions. They like mechanisms that can be used to solve a wide variety of problems, and are not afraid to introduce conceptual complexity to get there. Two examples for you: outer joins and recursive queries. Oracle's (+) syntax is more compact than what got into the spec, but less precise and less functional. For recursive queries, CONNECT BY is way simpler than what got into the spec, but again doesn't cover as much ground. The SKYLINE clause seems to me to be right about on par with CONNECT BY ... it does something useful, but only one thing.
  • 87. Solution: pgFoundry
  • 88. Contributor Resources
  • 89. Mailing Lists ● Hackers list – pgsql-hackers – main list for development discussion ● Patch list – pgsql-patches – submit your patch here after discussion on -hackers ● Specific feature lists – pgsql-jdbc, pgsql-performance, pgsql-sql, etc. – subscribe at www.postgresql.org/community/lists
  • 90. Web Sites ● www.postgresql.org – main site ● www.pgfoundry.org – add-ins, drivers, tools ● developer.postgresql.org – developer wiki, including TODO lists ● archives.postgresql.org – mailing list archives; search for your idea here
  • 91. Documentation ● www.postgresql.org/docs – main documentation – internals: /docs/current/static/internals.html – code conventions: /docs/current/static/source.html ● doxygen.postgresql.org – annotated source code ● www.postgresql.org/docs/faqs.FAQ_DEV.html – developer FAQ
  • 92. The PostgreSQL Year ● RC and Branch: December 2007 – Development Period ● Patch Commit Fest: February 1, 2008 – Development Period ● Patch Commit Fest: April 1, 2008 – Development Period ● Patch Commit Fest: June 1, 2008 – Development Period ● Feature Freeze: August 1, 2008 – Integration & Review (1 month) ● Beta: September, 2008 – Beta Testing (1-2 months) ● RC and Branch: October, 2008
  • 93. Other tips on submitting ● Don't get discouraged. – Be prepared to argue. – One hacker rejecting your idea doesn't mean everyone does. – Committers (esp. Tom Lane) are often more concerned about maintainability than cool stuff. ● Be flexible: you will have to make changes. – Corporate and academic coding standards are generally lower than the project's.
  • 94. Other tips on submitting ● Don't use the wrong arguments – “MySQL/Oracle does it this way.” – “Based on this hot academic trend.” ● Some things make a patch harder to accept – New syntax – Backwards compatibility issues – High code counts ● Don't get discouraged.
  • 95. Now, go write some code. Or contribute in some easier way.
  • 96. Contact Information ● Josh Berkus – josh@postgresql.org – blogs.ittoolbox.com/database/soup – www.sun.com/postgresql ● Pavan Deolasee – pavan.deolasee@enterprisedb.com – www.enterprisedb.com ● PostgreSQL India – in@postgresql.org. This talk is copyright 2007 Josh Berkus, and is licensed under the Creative Commons Attribution license.
