Development of 8.3 In India

8.3
A Story of Many Patches

December 2007
FOSS.in
Josh Berkus, PostgreSQL Core Team

Many, Many Patches
E.1. Release 8.3
Release date: 2007-12-??

Release date: CURRENT AS OF 2007-10-24

E.1.1. Overview
This release represents a major leap forward for PostgreSQL by adding significant new functionality and
performance enhancements. This was made possible by a growing community that has dramatically accelerated the
pace of development. This release adds the follow major capabilities:

Full text search now fully integrated into the core database system

Support the SQL/XML standard, including new operators and an XML data type

Support for enumerated data types (ENUM)

Add Universally Unique Identifier (UUID) data type

Support arrays of composite types

Add control over whether NULLs sort first or last

Support updatable cursors

PostgreSQL 8.3 Features
● Developer ● Consistency
– SQL/XML – HOT
– Integrated TSearch2 – Load Distributed
– UUID, ENUM Checkpoint
– PL/pgSQL debugging
● Performance
● Admin – Synchronized Scan
– CSV Logging
– Asynch Commit
– Better Stats
● Accessories
– pgStandby – pgBouncer
– pgSNMP

Many Developers
Tom Lane, USA Teodor Sigaev, Russia Steve Marshall
Peter Eisentraut, Germany Alvaro Herrera, Chile Paul Bayer
Bruce Momjian, USA Mark Kirkwood, New Zealand Doug Knight
Dave Page, England Joachim Wieland Greg Sabino Mullane, USA
Pavan Deolasee, India Henry Hotz, USA Chad Wagner
Itagaki Takahiro, Japan Magnus Haglander, Sweden Brendan Jurd
Greg Smith, USA Tatsuo Ishii, Japan Euler Taviera de Oliveira, Braz
David Fetter, USA Victor Wagner Joe Conway, USA
Pavel Stehule, Czech Bill Moran, USA Simon Riggs, England
Greg Stark, England Andrew Dunstan, USA Guillaume Smet, France
Jan Wieck, USA Arul Shaji, Australia Hiroshi Saito, Japan
Oleg Bartunov, Russia Nickolay Samokhvalov, Russia Chris Marcellino, Italy
Florian Pflug Neil Conway, Canada Dave Cramer, Canada
Jeff Davis, USA Marc Fournier, Canada Devrim Gunduz, Turkey
Trevor Hardcastle Jaime Casanova, Ecuador Gavin Sherry, Australia
Nikhil S, India Albert Cervera Jeremy Drake
Holdger Schurig Bernd Helmle, Germany Marko Kreen, Estonia
D'Arcy Cain, Canada Glen Parker Kris Jurka, USA
Gevik Babakhani, Netherlands Heikki Linnakangas, Finland Tom Dunstan, USA

Many Developers
Tom Lane, USA Teodor Sigaev, Russia Steve Marshall
Peter Eisentraut, Germany Alvaro Herrera, Chile Paul Bayer
Bruce Momjian, USA Mark Kirkwood, New Zealand Doug Knight
Dave Page, England Joachim Wieland Greg Sabino Mullane, USA
Pavan Deolasee, India Henry Hotz, USA Chad Wagner
Itagaki Takahiro, Japan Magnus Haglander, Sweden Brendan Jurd
Greg Smith, USA Tatsuo Ishii, Japan Euler Taviera de Oliveira, Braz
David Fetter, USA Victor Wagner Joe Conway, USA
Pavel Stehule, Czech Bill Moran, USA Simon Riggs, England
Greg Stark, England Andrew Dunstan, USA Guillaume Smet, France
Jan Wieck, USA Arul Shaji, Australia Hiroshi Saito, Japan
Oleg Bartunov, Russia Nickolay Samokhvalov, RussiaChris Marcellino, Italy
Florian Pflug Neil Conway, Canada Dave Cramer, Canada
Jeff Davis, USA Marc Fournier, Canada Devrim Gunduz, Turkey
Trevor Hardcastle Jaime Casanova, Ecuador Gavin Sherry, Australia
Nikhil S, India Albert Cervera Jeremy Drake
Holdger Schurig Bernd Helmle, Germany Marko Kreen, Estonia
D'Arcy Cain, Canada Glen Parker Kris Jurka, USA
Gevik Babakhani, Netherlands Heikki Linnakangas, Finland Tom Dunstan, USA

Why contribute?
● PostgreSQL is a community project
– owned by the community, run by the community
– if you contribute, you are a full participant
● unlike some other databases
● Tinker with cool database stuff
– we are hard-core database geeks
– learn a lot from top database hackers
● Improve your employment prospects
– database engineers are always in demand

SQL/XML
XMLROOT ( <?xml version=’1.0’
XMLELEMENT ( standalone=’yes’ ?>
NAME ’gazonk’, <gazonk name=’val’
num=’2’>
XMLATTRIBUTES ( <qux>foo</qux>
’val’ AS ’name’, </gazonk>
1 + 1 AS ’num’),
XMLELEMENT (
NAME ’qux’,
’foo’)
table_to_xml(tbl regclass,
),
nulls boolean,
VERSION ’1.0’,
tableforest boolean,
STANDALONE YES )
targetns text)

SELECT *
FROM table1
WHERE (xpath(’//person/name/text()’,
xdata))[1]::text = ’John Smith’;

SQL/XML Prior Work
2002
● TorchBox contributes /contrib/xml2
– Some XML functionality:
● Xpath functions
● XSLT functions
– BUT
● Non-standard, completely PostgreSQL syntax
● No real data type
● Many features missing
– Charset support
– DTD support

SQL/XML Prior Work
2004
● Peter Eisentraut writes XML export
– Export table to XML
– BUT
● prototype only
● not useful without
other XML functionality
● syntax requires changing
PostgreSQL parser

SQL/XML Prior Work
2005
● Pavel Stuehle writes SQL/XML syntax demo
– First standard syntax example
– BUT
● depends on PL/perl
● prototype only
● does not integrate
with /contrib/xml2

Nikolay Samokhvalov
● Graduate Student at
University of Moscow
● Met major contributor
Oleg Bartunov in 2005
– ported MoiKrug.ru to
PostgreSQL
● Masters Thesis:
updatable XML views
in RDBMS

● Google funds 700 students to work on Open
Source
– PostgreSQL gets 7
● Nickolay proposes project for SQL/XML
– Proposal accepted
– Peter will mentor

SoC Proposal
[SoC Proposal] Initial support of XMLType for PostgreSQL
Summary
Primary goal is introduction of special type support for storing XML data in ORDBMS PostgreSQL, querying this data and
modifying it. This project is intended to develop manipulation abilities rather than special storage engine (VARCHAR as
initial storage implicit type).

At the moment there is no good general vision of most suitable storage for XMLType. Moreover, from my point of view,
DBMS should have support of different index types for XMLType - every for its special purpose. And which is more
important is an open question. That's why I propose to work on 'external' things rather than 'internals' (data structure for
index) and strictly follow standards. But anyway, I've included path index (#7 in the list of Deliverables), because now I
suppose that it is most expectative type of index (this item is optional, because here community's feedback is highly
needed).
Deliverables
Ability to define any column as of XMLType. Initially, this means that only well-formed XML documents could be stored
in such a column.
Automatic validation of XML documents being inserted/modified against XML schema, if definition of column contains it
(reference to it). DTD and/or XML Schema could be used for this.
Subset of SQL/XML standard [1] for mixing relational and XML data in queries. This includes at least following:
XMLELEMENT, XMLAGG, XMLFOREST, XMLCONCAT expressions; implementation of mapping rules for basic
types. (See Project Details for more details).
XML domains support: possibility to define domain based on XMLType, using XML schema (DTD / XML Schema).
Basic XPath support (existing experience - contrib/xml2 module - should be taken into account).
Basic XSLT support (existing experience - contrib/xml2 module - should be taken into account).
Path indexes for fast retrieval of XML documents (queries with XPath expressions in WHERE clause). [OPTIONAL]
Documentation (definition rules for XMLType, SQL/XML expressions, etc).

Mentor: Peter Eisentraut
● From Aachen, Germany
● Core Team member since 2004
● In charge of PostgreSQL Documentation
● Prior XML work

Specification Research
● ANSI SQL 2006 -- SQL/XML
XML Publishing Functions
xmlelement() Creates an XML element, allowing the name to be specified.
xmlattributes() Creates XML attributes from columns, using the name of each
column as the name of the corresponding attribute.
xmlroot() Creates the root node of an XML document.
xmlcomment() Creates an XML comment.
xmlpi() Creates an XML processing instruction.
xmlparse() Parses a string as XML and returns the resulting XML structure.
xmlforest() Creates XML elements from columns, using the name of each
column as the name of the corresponding element.
xmlconcat() Combines a list of individual XML values to create a single value
containing an XML forest.
xmlagg() Combines a collection of rows, each containing a single XML value,
to create a single value containing an XML forest.

Specification Research
● ANSI SQL 2006 -- SQL/XML

Specification Proposal
Re: Google SoC--Idea Request

From: "Nikolay Samokhvalov" <samokhvalov ( at ) gmail ( dot ) com>
Subject: Re: Google SoC--Idea Request
Date: Tue, 2 May 2006 12:34:43 +0400

Proposal: XMLType for PostgreSQL.

*** Minimum: ***
to have special type support for storing XML data and working with it.
This means following:
- ability to define any column of a table as of XMLType; internally,
all data is stored as VARCHAR;
- auto validation of documents against XML schema, if it was
specified in column
definition or in XML data sheets themselves (DTD, XSD or at least one
of them) /*contrib/xml2 has such feature, but it uses libxml, what
means DOM interface. Maybe it's better to use some SAX parser to solve
this task*/;
- XPath indexes for queries with path expressions in WHERE clause /*I
suppose this kind of indexes would be most frequently used. I propose
using good labeling schema and GIST and/or Gin here*/;
- some subset of SQL/XML. Actually, part 14 of SQL:200n (SQL/XML) has
more than 400 pages now and contains some established constructions,
that are using in other DBMSes. There is the some patch already
written by Pavel Stehule:
http://www.pgsql.ru/db/mw/msg.html?mid=2096818. (BTW, what is with it?
it was kept for 8.2, so what is the result?) I've tested it several
months ago, basic SQL/XML functions worked fine. It changes grammar,

Specification Revisions
XML export function signatures

From: Peter Eisentraut <peter_e ( at ) gmx ( dot ) net>
To: pgsql-hackers ( at ) postgresql ( dot ) org
Subject: XML export function signatures
Date: Mon, 12 Feb 2007 20:18:59 +0100

Here are the proposed signatures for the XML export functions.

While I have seen the output formats in use elsewhere, I could not find
any useful information on how to invoke these mappings, so the
following is purely my own invention.

table_to_xml(tbl regclass, nulls boolean, tableforest boolean, targetns text) RETURNS xml
query_to_xml(query text, nulls boolean, tableforest boolean, targetns text) RETURNS xml
table_to_xmlschema(tbl regclass, nulls boolean, tableforest boolean, targetns text) RETURNS xml
query_to_xmlschema(query text, nulls boolean, tableforest boolean, targetns text) RETURNS xml
table_to_xml_and_xmlschema(tbl regclass, nulls boolean, tableforest boolean, targetns text)
RETURNS xml
query_to_xml_and_xmlschema(query text, nulls boolean, tableforest boolean, targetns text)
RETURNS xml
cursor_get_xml(cursor refcursor, count int, nulls boolean, tableforest boolean, targetns text)
RETURNS xml
cursor_to_xmlschema(cursor refcursor, nulls boolean, tableforest boolean, targetns text) RETURNS
xml

Specification Revisions
Re: XML export function signatures

To: Andrew Dunstan <andrew ( at ) dunslane ( dot ) net>
Subject: Re: XML export function signatures
Date: Mon, 12 Feb 2007 23:57:49 +0100

Andrew Dunstan wrote:
> . table_to_xml_and_xmlschema seems like a mouthful - can we shorten
> it a bit?

Well, it gives you back a mouthful of data, too. :)

> . what are the two ways of representing data that tableforest
> distinguishes?

tableforest = false gives you something like

<tablename>
<row> 
<col1name>data</col1name>
<col2name>data</col2name>
</row>
<row>

Approved Specification
● XML Data Type
– type-safe XML storage
– supports XML operators, functions
● XML Functions
– Generation
– Manipulation (XLST)
– Export
– XPath Query
● XML Expressions
– IS DOCUMENT, etc.

Initial Versions of Patch
Updated XML patch

To: pgsql-patches ( at ) postgresql ( dot ) org
Subject: Updated XML patch
Date: Thu, 14 Dec 2006 23:02:05 +0100

Attached is an updated patch for XML functionality, which subsumes all
earlier patches on the subject. This includes a data type with format
checking, and functions to mangle values. For the moment, I have cut
out the inessential stuff such as xpath. The included regression test
file xml.sql shows some of the things that work.

This patch already covers most of the parser work. What is left
hereafter is adjusting all the corner cases, the escaping rules, and
the various XML parser behaviors.

Use configure --with-libxml to build.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
Attachment: current-xml-patch.bz2Description: BZip2 compressed data

static void Initial Patch
+_outXmlExpr(StringInfo str, XmlExpr *node)
+{
+ WRITE_NODE_TYPE("XMLEXPR");
+
+ WRITE_ENUM_FIELD(op, XmlExprOp);
+ WRITE_STRING_FIELD(name);
+ WRITE_NODE_FIELD(named_args);
+ WRITE_NODE_FIELD(args);
+}
+
+static void
_outCoerceToDomain(StringInfo str, CoerceToDomain *node)
{
WRITE_NODE_TYPE("COERCETODOMAIN");
@@ -2019,6 +2030,9 @@
case T_BooleanTest:
_outBooleanTest(str, obj);
break;
+ case T_XmlExpr:
+ _outXmlExpr(str, obj);
+ break;
case T_CoerceToDomain:
_outCoerceToDomain(str, obj);
break;
diff -Nru -x configure ../cvs-pgsql/src/backend/nodes/readfuncs.c
./src/backend/nodes/readfuncs.c
--- ../cvs-pgsql/src/backend/nodes/readfuncs.c 2006-12-12 16:31:46.000000000 +0100
+++ ./src/backend/nodes/readfuncs.c 2006-12-14 21:20:08.000000000 +0100
@@ -765,6 +765,22 @@
}

Patch Revisions
Re: xml type and encodings

From: "Andrew Dunstan" <andrew ( at ) dunslane ( dot ) net>
To: "Peter Eisentraut" <peter_e ( at ) gmx ( dot ) net>
Subject: Re: xml type and encodings
Date: Mon, 15 Jan 2007 16:35:13 -0600 (CST)

Peter Eisentraut wrote:
> Florian G. Pflug wrote:
>> Couldn't the server change the encoding declaration inside the xml to
>> the correct
>> one (the same as client_encoding) before returning the result?
>
> The data type output function doesn't know what the client encoding is
> or whether the data will be shipped to the client at all. But what I'm
> thinking is that we should remove the encoding declaration if possible.
> At least that would be less confusing, albeit still potentially
> incorrect if the client continues to process the document without care.

The XML SPec says:

"In the absence of information provided by an external transport protocol
(e.g. HTTP or MIME), it is a fatal error for an entity including an
encoding declaration to be presented to the XML processor in an encoding
other than that named in the declaration, or for an entity which begins
with neither a Byte Order Mark nor an encoding declaration to use an
encoding other than UTF-8. Note that since ASCII is a subset of UTF-8,
ordinary ASCII entities do not strictly need an encoding declaration."

More Patch Revisions
xpath_array with namespaces support

From: "Nikolay Samokhvalov" <samokhvalov ( at ) gmail ( dot ) com>
To: PGSQL-Patches <pgsql-patches ( at ) postgresql ( dot ) org>
Subject: xpath_array with namespaces support
Date: Wed, 21 Feb 2007 02:46:33 +0300

As a result of discussion with Peter, I provide modified patch for
xpath_array() with namespaces support.

The signature is:
_xml xpath_array(text xpathQuery, xml xmlValue[, _text namespacesBindings])

The third argument is 2-dimensional array defining bindings for
namespaces. Simple examples:

xmltest=# SELECT xpath_array('//text()', '<local:data
xmlns:local="http://127.0.0.1";><local:piece id="1">number
one</local:piece><local:piece id="2" /></local:data>');
xpath_array
----------------
{"number one"}
(1 row)

Yet More Revisions
correct format for date, time, timestamp for XML functionality

From: "Pavel Stehule" <pavel ( dot ) stehule ( at ) hotmail ( dot ) com>
To: pgsql-patches ( at ) postgresql ( dot ) org
Subject: correct format for date, time, timestamp for XML functionality
Date: Tue, 20 Feb 2007 13:27:42 +0100

Hello,

this patch ensures independency datetime fields on current datestyle setting. Add new internal
datestyle USE_XSD_DATESTYLE. It's almoust same to USE_ISO_DATESTYLE. Differences are for timestamp:
ISO: yyyy-mm-dd hh24:mi:ss
XSD: yyyy-mm-ddThh24:mi:ss

I found one link about this topic:
http://forums.oracle.com/forums/thread.jspa?threadID=467278&tstart=0
Regards
Pavel Stehule

Patch Accepted

From: Bruce Momjian <bruce ( at ) momjian ( dot ) us>
To: "Nikolay Samokhvalov" <samokhvalov ( at ) gmail ( dot ) com>
Subject: Re: [HACKERS] xml2 contrib patch supporting default XML namespaces
Date: Thu, 22 Mar 2007 16:16:16 -0400 (EDT)

Your patch has been added to the PostgreSQL unapplied patches list at:

http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

Write Documentation
9.14. XML Functions
The functions and function-like expressions described in this section operate on values of type xml. Check Section 8.13 for
information about the xml type. The function-like expressions xmlparse and xmlserialize for converting to and from type
xml are not repeated here. Use of many of these functions requires the installation to have been built with configure --
with-libxml.

9.14.1. Producing XML Content
A set of functions and function-like expressions are available for producing XML content from SQL data. As such, they are
particularly suitable for formatting query results into XML documents for processing in client applications.

9.14.1.1. xmlcomment
xmlcomment(text)

The function xmlcomment creates an XML value containing an XML comment with the speciﬁed text as content. The text
cannot contain -- or end with a - so that the resulting construct is a valid XML comment. If the argument is null, the result
is null.

Example:
SELECT xmlcomment('hello');

xmlcomment
--------------

XML in 8.3 Beta
E.1. Release 8.3
Release date: 2007-12-??

Release date: CURRENT AS OF 2007-10-24

E.1.1. Overview
This release represents a major leap forward for PostgreSQL by adding signiﬁcant new functionality and
performance enhancements. This was made possible by a growing community that has dramatically accelerated the
pace of development. This release adds the follow major capabilities:

Full text search now fully integrated into the core database system

Support the SQL/XML standard, including new operators and an XML data
type
Support for enumerated data types (ENUM)

Add Universally Unique Identiﬁer (UUID) data type

SQL/XML Feature Set
● XML Parsing
● XML Functions
● XML Export
● XPath B-tree Index

Future XML Projects
● Use of HSTORE for advanced XML indexing
● Automated XML decomposition
– XML-to-Table
– XML-to-Schema
● PL/XSLT
– XHTML query
● XQuery support

Fastest OSDB
J2EE Througput Acquisition Cost Comparison
900 200000

800 180000

700 160000

Cost in US Dollars
140000
600
120000
500
100000
400
80000
300
60000
200
40000

100 20000

0 0
MySQL PostgreSQL Proprietary MySQL PostgreSQL Proprietary

What's MVCC?
● Multi-Version Concurrency Control
– Each user gets their own “version” of the data
– Allows parallelization of updates/reads
– Without it, scalability is not possible
● You have to lock everything
● Or violate ACID transactions

MVCC
Row Version 1 Row Version 1

Row Version 2

Row Version 3

SELECT ... SELECT ...
BEGIN UPDATE
BEGIN UPDATE

MVCC


Row Version 3 ROLLBACK

SELECT ... SELECT ... SELECT ...
BEGIN UPDATE COMMIT
BEGIN UPDATE COMMIT

MVCC Them & Us
The Overwriting Model
InnoDB & Oracle

Base Relation Rollback Segment

Old Row Version
y
C op
UPDATE Overwrite Row

MVCC Them & Us
The Overwriting Model
InnoDB & Oracle

● Advantages
– Low table/index maintenance requirements
– Latest row version fast access
● Disadvantages
– Transaction isolation can break
– Long-running transactions expensive
– Rollbacks very expensive
– Rollback segment bottleneck

MVCC Them & Us
The Non-overwriting Model
PostgreSQL & Firebird

Base Relation

UPDATE Old Row Version

Copy

New Row Version

MVCC Them & Us
The Non-overwriting Model
PostgreSQL & Firebird

● Advantages
– Transaction isolation effortless
– Rollbacks free
– Long-running transactions not a problem
● Disadvantages
– High table/index maintenance
– “Frequently updated table” problem

Frequently Updated
Tables
Tuplestore

Row C: Version 1
small update
Indexes Updated Row C: Version 2
small update

small update


large update

Frequently Updated
Tables
Tuplestore

Row C: Version 5

Pavan Deolasee
● Graduated IIT Bombay
– focus on databases
● Worked for VERITAS
● Lead Engineer at
EnterpriseDB
– PostgreSQL vendor
– Contributes performance
patches to community
● Lives in Pune

Team Effort

● Simon Riggs ● Heikki Linnakangas, Tom
– original Lane and others
proposal – revisions
– prototypes – code review
– specification – bug fixes

Initial proposal
● Update-
in-Place Base Relation HOT File

with
HOT file Row C: Version 1

Initial proposal
● Update-

with
copy old version
A TE
U PD Row C: Version 1

Initial proposal
● Update-

with

TE
DA Row C: Version 1
UP copy old
version
tuple chain

Row C: Version 2

Initial proposal
● Update-

with

TE
DA Row C: Version 1
UP

copy old tuple chain
version
Row C: Version 2

tuple chain

Row C: Version 3

Initial proposal
● Update-
in-Place Base Relation

with

First proposal
to pgsql-hackers
Frequent Update Project: Design Overview of HOT Updates

From: "Simon Riggs" <simon ( at ) 2ndquadrant ( dot ) com>
To: <pgsql-hackers ( at ) postgresql ( dot ) org>
Subject: Frequent Update Project: Design Overview of HOT Updates
Date: Thu, 09 Nov 2006 17:13:16 +0000

Design Overview of HOT Updates
------------------------------

The objective is to increase the speed of the UPDATE case, while
minimizing the overall negative effects of the UPDATE. We refer to the
general requirement as *Frequent Update Optimization*, though this
design proposal is for Heap Overflow Tuple (HOT) Updates. It is similar
in some ways to the design for SITC already proposed, though has a
number of additional features drawn from other designs to make it a
practical and effective implementation.

EnterpriseDB have a working, performant prototype of this design. There
are still a number of issues to resolve and the intention is to follow
an open community process to find the best way forward. All required
detail will be provided for the work conducted so far.

Current PGSQL behaviour is for UPDATEs to create a new tuple version
within the heap, so acts from many perspectives as if it were an INSERT.
All of the tuple versions are chained together, so that whichever of the
tuples is visible to your Snapshot, you can walk the chain to find the
most recent tuple version to update.

Revisions:
Reverse Order
Normal Tuples HOT Relation File

Row C: Version 1
in-page update
A TE

in-page update

Row C: Version 3

in-page update

Row C: Version 4

Revisions:
Chains, not files
Normal Tuples HOT Tuple Chain

Row C: Version 1
in-page update
A TE

in-page update

Row C: Version 3

in-page update

Row C: Version 4

Add microvacuum
Normal Tuples HOT Tuple Chain

Row C: Version 1
in-page update
A TE

in-page update

microvacuum Row C: Version 3

in-page update

Row C: Version 4

new page / index update

Submit patch draft v.1
HOT WIP Patch - version 1

From: "Pavan Deolasee" <pavan ( dot ) deolasee ( at ) gmail ( dot ) com>
To: PostgreSQL-development <pgsql-hackers ( at ) postgresql ( dot ) org>
Subject: HOT WIP Patch - version 1
Date: Wed, 14 Feb 2007 15:34:46 +0530

This is a WIP patch based on the recent posting by Simon and discussions
thereafter. We are trying to do one piece at a time and intention is to post
the work ASAP so that we could get early and continuous feedback from
the community. We could then incorporate those suggestions in the next
WIP patch.

To start with, this patch implements HOT-update for a simple case
when there is enough free space in the same block so that it can
accommodate the new version of the tuple. A necessary condition for
doing HOT-update is that none of the index columns is changed.
The old version is marked as HEAP_UPDATE_ROOT and the new
version is marked as HEAP_ONLY_TUPLE. If a tuple is HOT-updated,
no new index entry is added.

Submit another version
HOT WIP Patch - version 2

To: PostgreSQL-development <pgsql-hackers ( at ) postgresql ( dot ) org>,
pgsql-patches ( at ) postgresql ( dot ) org
Subject: HOT WIP Patch - version 2
Date: Tue, 20 Feb 2007 12:08:14 +0530

Reposting - looks like the message did not get through in the first
attempt. My apologies if multiple copies are received.

This is the next version of the HOT WIP patch. Since the last patch that
I sent out, I have implemented the HOT-update chain pruning mechanism.

When following a HOT-update chain from the index fetch, if we notice that
the root tuple is dead and it is HOT-updated, we try to prune the chain to
the smallest possible length. To do that, the share lock is upgraded to an
exclusive lock and the tuple chain is followed till we find a live/recently-dead
tuple. At that point, the root t_ctid is made point to that tuple. In order to
preserve the xmax/xmin chain, the xmax of the root tuple is also updated
to xmin of the found tuple. Since this xmax is also < RecentGlobalXmin

Submit another version
HOT WIP Patch - version 3.2

To: PostgreSQL-development <pgsql-hackers ( at ) postgresql ( dot ) org>,
pgsql-patches ( at ) postgresql ( dot ) org
Subject: HOT WIP Patch - version 3.2
Date: Sun, 25 Feb 2007 00:06:04 +0530

Please see the attached WIP HOT patch - version 3.2. It now
implements the logic for reusing heap-only dead tuples. When a
HOT-update chain is pruned, the heap-only tuples are marked
LP_DELETE. The lp_offset and lp_len fields in the line pointer are
maintained.

When a backend runs out of free space in a page when doing an
UPDATE, it searches the line pointers to find a slot
which is marked LP_DELETEd and has enough space to accommodate
the new tuple. If such a slot is found, its reused. We might
waste some space if the slot is larger than the tuple, but
that gets reclaimed at VACUUM time.

Yet another version
HOT WIP Patch - version 6.3

To: PostgreSQL-development <pgsql-hackers ( at ) postgresql ( dot ) org>
Subject: HOT WIP Patch - version 6.3
Date: Mon, 2 Apr 2007 17:51:13 +0530

Please see the HOT version 6.3 patch posted on pgsql-patches.
I've implemented support for CREATE INDEX and CREATE INDEX
CONCURRENTLY based on the recent discussions. The implementation
is not yet complete and needs some more testing/work/discussion
before we can start considering it for review.

One of the regression test case fails because CIC now works in
three phases. In the first phase, we just create the catalog entry
for the index and commit the transaction. If the index_build fails
because of any error (say, unique key constraint) the index creation
fails, but the catalog entry remains.

Many issues resolved
● CREATE INDEX
– including CONCURRENTLY
● Re-using dead tuples
● Interaction with Cluster
● Plan invalidation
● Utilities & tools

Tom Lane says:
“break it up, please!”
● Too big a patch for reviewers
– almost 12,000 lines
● Broken up into 5 parts
– 1. The basic HOT implementation
– 2. Retain vacuum, chain pruning and other tricks
– 3. Fix the broken VACUUM and VACUUM FULL
code
– 4. Fix the broken CREATE INDEX
– pg_stats and other misc. utilities

SKYLINE OF

SKYLINE OF [DISTINCT] d1 [MIN | MAX |
DIFF], .., dm [MIN | MAX | DIFF]

SELECT *
FROM books
SKYLINE OF rating MAX, price MIN;

Extension to SQL syntax

SKYLINE OF [DISTINCT] d1
[MIN | MAX | DIFF], .., dm
[MIN | MAX | DIFF]

Approximate Queries
SELECT *
FROM Books
SKYLINE OF rating MAX, price MIN;

Problems with the Patch
Re: PostgreSQL - 'SKYLINE OF' clause added!

From: Tom Lane <tgl ( at ) sss ( dot ) pgh ( dot ) pa ( dot ) us>
To: Shane Ambler <pgsql ( at ) Sheeky ( dot ) Biz>
Subject: Re: PostgreSQL - 'SKYLINE OF' clause added!
Date: Thu, 08 Mar 2007 01:12:22 -0500

Shane Ambler <pgsql ( at ) Sheeky ( dot ) Biz> writes:
> Tom Lane wrote:
>> Well, whether it's horrible or not is in the eye of the beholder, but
>> this is certainly a non-standard syntax extension.

> Being non-standard should not be the only reason to reject a worthwhile
> feature.

No, but being non-standard is certainly an indicator that the feature
may not be of widespread interest --- if it were, the SQL committee
would've gotten around to including it; seems they've managed to include
everything but the kitchen sink already. Add to that the complete lack
of any previous demand for the feature, and you have to wonder where the
market is.

> The fact that several
> different groups have been mentioned to be working on this feature would
> indicate that it is worth considering.

Problems with the Patch
● Not part of the ANSI SQL standard
– possibly low general applicability
– might get added to standard with different syntax
– might never get standardized at all
● Requires changes to PostgreSQL parser
– new keyword break applications
– possible side effects
● Not coded to PostgreSQL standards
– would need refactoring

Rejected!
Re: PostgreSQL - 'SKYLINE OF' clause rejected

From: Tom Lane <tgl ( at ) sss ( dot ) pgh ( dot ) pa ( dot ) us>
To: Shane Ambler <pgsql ( at ) Sheeky ( dot ) Biz>
Subject: Re: PostgreSQL - 'SKYLINE OF' clause added!
Date: Sun, 11 Mar 2007 23:44:41 -0400

Shane Ambler <pgsql ( at ) Sheeky ( dot ) Biz> writes:
> If we consider this thoroughly and compile a suitable syntax that covers
> all bases it could be used as the basis of the standard definition or be
> close to what ends up in the standard.

I'll bet you a very good dinner that the word SKYLINE will never be seen
in the standard.

To me, the proposed feature seems an extremely narrow, special-purpose
thing. The SQL committee have never been into that very much, and seem
even less interested in the last couple of revisions. They like
mechanisms that can be used to solve a wide variety of problems, and
are not afraid to introduce conceptual complexity to get there.
Two examples for you: outer joins and recursive queries. Oracle's
(+) syntax is more compact than what got into the spec, but less
precise and less functional. For recursive queries, CONNECT BY is
way simpler than what got into the spec, but again doesn't cover as
much ground. The SKYLINE clause seems to me to be right about on
par with CONNECT BY ... it does something useful, but only one thing.

Mailing Lists
● Hackers list
– pgsql-hackers
– main list for development discussion
● Patch list
– pgsql-patches
– submit your patch here after discussion on -hackers
● Specific feature lists
– pgsql-jdbc, pgsql-performance, pgsql-sql, etc.
– subscribe at www.postgresql.org/community/lists

Web Sites
● www.postgresql.org
– main site
● www.pgfoundry.org
– add-ins, drivers, tools
● developer.postgresql.org
– developer wiki, including TODO lists
● archives.postgresql.org
– mailing list archives -- search for your idea here

Documentation
● www.postgresql.org/docs
– main documentation
– internals:/docs/current/static/internals.html
– code conventions: /docs/current/static/source.html
● doxygen.postgresql.org
– annotated source code
● www.postgresql.org/docs/faqs.FAQ_DEV.html
– developer FAQ

The PostgreSQL Year
RC and Branch December 2007
Development Period
Patch Commit Fest February 1, 2008

Development Period
Patch Commit Fest April 1, 2008

Development Period
Patch Commit Fest June 1, 2008
Development Period
Feature Freeze August 1, 2008
Integration & Review
(1 month)
Beta Beta Testing September, 2008
(1-2 months)

RC and Branch October, 2008

Other tips on submitting
● Don't get discouraged.
– Be prepared to argue.
– One hacker rejecting your idea doesn't mean
everyone does.
– Committers (esp. Tom Lane) are often more
concerned about maintainability than cool stuff.
● Be flexible: you will have to make changes.
– Corporate and academic coding standards are
generally lower than the project's.

Other tips on submitting
● Don't use the wrong arguments
– “MySQL/Oracle does it this way.”
– “Based on this hot academic trend.”
● Some things make a patch harder to accept
– New syntax
– Backwards compatibility issues
– High code counts
● Don't get discouraged.

Now, go write some
code.
or contribute in some easier way

Contact Information
● Josh Berkus ● Pavan Deolasee
– josh@postgresql.org – pavan.deolasee
– blogs.ittoolbox.com/ @enterprisedb.com
database/soup – www.enterprisedb.com
– www.sun.com/postgresql
● PostgreSQL India
– in@postgresql.org

This talk is copyright 2007 Josh Berkus, and is licensed under the creative commons attribution license

Development of 8.3 In India

Recommended

Recommended

More Related Content

Similar to Development of 8.3 In India

Similar to Development of 8.3 In India (20)

More from PostgreSQL Experts, Inc.

More from PostgreSQL Experts, Inc. (20)

Recently uploaded

Recently uploaded (20)

Development of 8.3 In India