Developing A Procedural Language For Postgre Sql

Loading...

Flash Player 9 (or above) is needed to view presentations.
We have detected that you do not have it on your computer. To install it, go here.

0 comments

Post a comment

    Post a comment
    Embed Video
    Edit your comment Cancel

    1 Favorite

    Developing A Procedural Language For Postgre Sql - Presentation Transcript

    1. Developing a Procedural Language for PostgreSQL  
    2. I CAN HAZ P.L.Z PLZ?  
    3. MKAY... WHY? To learn how
    4. MKAY... WHY? To learn how To get to speak at conferences
    5. MKAY... WHY? To learn how To get to speak at conferences Achieve world domination
    6. MKAY... WHY? To learn how To get to speak at conferences Achieve world domination
    7. UM, WHAT?
    8. WHAT'S LOLCODE? 1 Oh hai. In teh beginnin 1 In the beginning, some Guy Ceiling Cat maded teh skiez made an Image of a cat. And An da Urfs, but he did not the picture was without eated dem. caption, and void. 2 Da Urfs no had shapez An 2 And said Guy did make a haded dark face, An Ceiling caption unto his Image. And Cat rode invisible bike over it was funny, and the Guy did teh waterz. call it good. 3 At start, no has lyte. An 3 And this guy's Image did Ceiling Cat sayz, i can haz become an internet meme, all lite? An lite wuz. Web 2.0-ish and stuff. 4 An Ceiling Cat sawed teh 4 There was much rejoicing. lite, to seez stuffs...
    9. WHAT'S LOLCODE A lolcat is an image combining a photograph, most frequently a cat, with a humorous and idiosyncratic caption in (often) broken English—a dialect which is known as \"lolspeak\", or \"kitteh\". The name \"lolcat\" is a compound word of \"LOL\" and \"cat\".[1] Another name is cat macro, being a type of image macro.[2] Lolcats are created for photo sharing imageboards and other internet forums. Lolcats are similar to other anthropomorphic animal-based image macros such as the O RLY?  owl.[3]
    10. UM... OKAY...
    11. SO, LOLCODE? Adam Lindsay had a revelation. From Ceiling Cat. \"This is a love letter to very clever people who are slightly bored. I had no idea there were so many of us out there.\"   - FAQ, lolcode.com
    12. CHEEZBURGER  
    13. CHEEZBURGER   SPEEK LOLCODE
    14. CHEEZBURGER   SPEEK LOLCODE ANATUMMY OF A PL
    15. CHEEZBURGER   SPEEK LOLCODE FUNKSHUN KALL HANDLRZ ANATUMMY OF A PL
    16. CHEEZBURGER   SPEEK LOLCODE FUNKSHUN KALL HANDLRZ MAEK UN INTURPRETR ANATUMMY OF A PL
    17. CHEEZBURGER   SPEEK LOLCODE
    18. I CAN SPEEK LOLCODE Data types NUMBR: Integer values NUMBAR: Floating point values YARN: String values NOOB: Null values TROOF: Boolean values (WIN / FAIL)
    19. I CAN SPEEK LOLCODE Operators Arithmetic SUM OF x AN y, DIFF OF x AN y PRODUKT OF x AN y, QUOSHUNT OF x AN y MOD OF x AN y BIGGR OF x AN y, SMALLR OF x AN y   Boolean BOTH OF x AN y EITHER OF x AN y WON OF x AN y ALL OF x AN y [AN ...] MKAY ANY OF x AN y [AN ...] MKAY NOT x
    20. I CAN SPEEK LOLCODE Operators (cont.) Comparison BOTH SAEM x AN y WIN iff x == y DIFFRINT x AN y FAIL iff x == y Concatenation SMOOSH x y z p d q ... MKAY  Concatenates infinitely many YARNs Casting MAEK x A <type> x IS NOW A <type>
    21. I CAN SPEEK LOLCODE Comparison   BOTH SAEM ANIMAL AN \"CAT\", O RLY? YA RLY, VISIBLE \"J00 HAV A CAT\" MEBBE BOTH SAEM ANIMAL AN \"MAUS\" VISIBLE \"JOO HAV A MAUS? WTF?\" NO WAI, VISIBLE \"J00 SUX\" OIC
    22. I CAN SPEEK LOLCODE Case COLOR, WTF? OMG \"R\" VISIBLE \"RED FISH\" GTFO OMG \"Y\" VISIBLE \"YELLOW FISH\" OMG \"G\" OMG \"B\" VISIBLE \"FISH HAS A FLAVOR\" GTFO OMGWTF VISIBLE \"FISH IS TRANSPARENT\" OIC
    23. I CAN SPEEK LOLCODE Loops   IM IN YR <label> <operation> YR <variable> [TIL|WILE <expression>] <code block> IM OUTTA YR <label>   IT variable The default when nothing else is specified Similar to Perl's $_
    24. I CAN SPEEK PL/LOLCODE PL/LOLCODE-specific stuff VISIBLE == PL/pgSQL RAISE Alt. VISIBLE <level> \"Message\" FOUND YR <val> == PL/pgSQL RETURN GIMMEH <var> OUTTA DATABUKKIT \"query\" Database interface (SPI) Function arguments are named LOL1, LOL2, etc. automatically PL/LOLCODE isn't smart enough yet to let you give them arbitrary names
    25. I CAN SPEEK PL/LOLCODE PL/LOLCODE-specific stuff VISIBLE == PL/pgSQL RAISE Alt. VISIBLE <level> \"Message\" FOUND YR <val> == PL/pgSQL RETURN GIMMEH <var> OUTTA DATABUKKIT \"query\" Database interface (SPI) LIAR LIAR PANTZ Function arguments are named LOL1, LOL2, etc. automatically ON FIRE!!!1 PL/LOLCODE isn't smart enough yet to let you give them arbitrary names
    26. I CAN SPEEK PL/LOLCODE PL/LOLCODE-specific stuff VISIBLE == PL/pgSQL RAISE Alt. VISIBLE <level> \"Message\" FOUND YR <val> == PL/pgSQL RETURN GIMMEH <var> OUTTA DATABUKKIT \"query\" Database interface (SPI) Function arguments are named LOL1, LOL2, etc. automatically PL/LOLCODE isn't smart enough yet to let you give them arbitrary names As of a few nights ago, this is no longer true. W00T!!1
    27. I CAN SPEEK PL/LOLCODE HAI I HAS A X ITZ SMOOSH \"MONORAIL\" \"KITTY\" MKAY I HAS A Y ITZ 3 I HAS A Z ITZ BIGGR OF Y AN 15 BOTH SAEM Z AN Y, O RLY? YA RLY, VISIBLE \"SUMTHINZ RONG!!\" NO WAI, VISIBLE \"THX. CAN I HAZ CHEEZBURGR?\" OIC IM IN YR MYLOOP Y R DIFF OF Y AN 1 DIFFRINT Y AN BIGGR OF Y AN 5, O RLY? YA RLY, GTFO OIC IM OUTTA YR MYLOOP KTHXBYE
    28. CHEEZBURGER   SPEEK LOLCODE ANATUMMY OF A PL
    29. ANATUMMY OF A PL Function call handler plpgsql_call_handler, plperl_call_handler, etc. This function is responsible for executing each function call for a particular language CREATE LANGUAGE Must be written in a compiled language Must take language_handler as a parameter Function validator plpgsql_validator, plperl_validator Optional Validates functions when they're created Takes an OID argument TRUSTED vs. UNTRUSTED
    30. ANATUMMY OF A PL Where's the interpreter? Some PLs come with their own interpreter PL/pgSQL, SQL/PSM, PL/LOLCODE, PL/Proxy Others use an interpreter from a shared library PL/Perl, PL/Python, PL/R, most everything else Some do other strange things PL/J Talks to a Java virtual machine via RMI calls PLs with built-in interpreters can use PostgreSQL's internal data types to avoid conversion overhead
    31. ANATUMMY OF A PL Where's the interpreter? Some PLs come with their own interpreter PL/pgSQL, SQL/PSM, PL/LOLCODE, PL/Proxy Others use an interpreter from a shared library PL/Perl, PL/Python, PL/R, most everything else Some do other strange things PL/J Talks to a Java virtual machine via RMI calls PLs with built-in interpreters can use PostgreSQL's internal data types to avoid conversion overhead PL/LOLCODE isn't that smart yet
    32. ANATUMMY OF A PL Compilation Commonly a function will be \"compiled\" the first time it is run in a session The function call handler looks up argument and return types only once Sometimes keeps the parse tree in memory Avoids having to re-parse the function next time it's called
    33. ANATUMMY OF A PL Compilation Commonly a function will be \"compiled\" the first time it is run in a session The function call handler looks up argument and return types only once Sometimes keeps the parse tree in memory Avoids having to re-parse the function next time it's called PL/LOLCODE isn't that smart yet
    34. CHEEZBURGER   SPEEK LOLCODE FUNKSHUN KALL HANDLRZ ANATUMMY OF A PL
    35. FUNKSHUN KALL HANDLRZ Job of the function handler Figure out if it's a TRIGGER function Determine argument values Determine return type Find the function source ...alternatively, find a pre-stored parse tree Execute the function Return the proper values
    36. FUNKSHUN KALL HANDLRZ PG_MODULE_MAGIC; Datum pl_lolcode_call_handler (PG_FUNCTION_ARGS); PG_FUNCTION_INFO_V1(pl_lolcode_call_handler); Datum pl_lolcode_call_handler(PG_FUNCTION_ARGS) { /* ... */ }
    37. FUNKSHUN KALL HANDLRZ  Get function information Form_pg_proc from server/catalog/pg_proc.h   Form_pg_proc procStruct; HeapTuple procTup; procTup = SearchSysCache(PROCOID, ObjectIdGetDatum(fcinfo- >flinfo->fn_oid), 0, 0, 0); if (!HeapTupleIsValid(procTup)) elog(ERROR, \"Cache lookup failed for procedure %u\", fcinfo->flinfo->fn_oid); procStruct = (Form_pg_proc) GETSTRUCT(procTup); ReleaseSysCache(procTup);
    38. FUNKSHUN KALL HANDLRZ Get function source   procsrcdatum = SysCacheGetAttr(PROCOID, procTup, Anum_pg_proc_prosrc, &isnull); if (isnull) elog(ERROR, \"Function source is null\"); proc_source = DatumGetCString (DirectFunctionCall1(textout, procsrcdatum));
    39. FUNKSHUN KALL HANDLRZ Get return type   typeTup = SearchSysCache(TYPEOID, ObjectIdGetDatum (procStruct->prorettype), 0, 0, 0); if (!HeapTupleIsValid(typeTup)) elog(ERROR, \"Cache lookup failed for type %u\", procStruct->prorettype); typeStruct = (Form_pg_type) GETSTRUCT(typeTup); resultTypeIOParam = getTypeIOParam(typeTup); fmgr_info_cxt(typeStruct->typinput, &flinfo, TopMemoryContext);
    40. FUNKSHUN KALL HANDLRZ Get arguments and their types   for (i = 0; i < procStruct->pronargs; i++) { snprintf(arg_name, 255, \"LOL%d\", i+1); lolDeclareVar(arg_name); LOLifyDatum(fcinfo->arg[i], fcinfo->argnull[i], procStruct->proargtypes.values[i], arg_name); } LOLifyDatum() gets the typeStruct for the argument type, gets the argument value as a C string, and sticks it in PL/LOLCODE's variable table
    41. FUNKSHUN KALL HANDLRZ Pass the function source to the interpreter   pllolcode_yy_scan_string(proc_source); pllolcode_yyparse(NULL); pllolcode_yylex_destroy();   More on this later
    42. FUNKSHUN KALL HANDLRZ Return the proper value if (returnTypeOID != VOIDOID) { if (returnVal->type == ident_NOOB) fcinfo->isnull = true; else { if (returnTypeOID == BOOLOID) retval = InputFunctionCall(&flinfo, lolVarGetTroof(returnVal) == lolWIN ? \"T\" : \"F\", resultTypeIOParam, -1); else { /* ... */ retval = InputFunctionCall(&flinfo, rettmp, resultTypeIOParam, -1); } } }
    43. FUNKSHUN KALL HANDLRZ Note the function manager interface (fmgr.[h|c]) InputFunctionCall Calls a previously determined datatype input function OutputFunctionCall Opposite of InputFunctionCall DirectFunctionCall[1..9] Call a specific, named function with various numbers of arguments Require FmgrInfo structures to describe the functions to be called Read through fmgr.h and fmgr.c for more details
    44. CHEEZBURGER   SPEEK LOLCODE FUNKSHUN KALL HANDLRZ MAEK UN INTURPRETR ANATUMMY OF A PL
    45. MAEK UN INTURPRETR Don't write the parser on your own. It's painful It's a lot of work It's easy to do wrong Someone else already has Writing the interpreter on your own is ok, but use a tool I wrote an LOLCODE interpreter to ... Learn how Be sure it was licensed the way I wanted it licensed (BSD)
    46. MAEK UN INTURPRETR Tools for writing parsers and interpreters There are a fair number of them ANTLR Lex + Yacc Flex + Bison Parsers, language definitions, etc. are complex business Top-down vs. Bottom-up, Associativity rules... I chose Flex + Bison, because that's what PostgreSQL uses I've no particular expertise to choose based on merits I forgot most of my formal languages class :)
    47. MAEK UN INTURPRETR Parser generation tools require a language definition Generally in a variant of BNF (Backus-Naur Form)   <symbol1> ::= <expression> <postal-address> ::= <name-part> <street-address> <zip-part> <name-part> ::= <personal-part> <last-name> <opt-jr-part> <EOL> | <personal-part> <name-part> <personal-part> ::= <first-name> | <initial> \".\" <street-address> ::= <opt-apt-num> <house-num> <street-name> <EOL> <zip-part> ::= <town-name> \",\" <state-code> <ZIP-code> <EOL> <opt-jr-part> ::= \"Sr.\" | \"Jr.\" | <roman-numeral> | \"\"Thanks, Wikipedia
    48. MAEK UN INTURPRETR Flex + Bison == two parts  Flex = lexer Divides input into a set of labeled tokens Bison = parser Given a stream of tokens, interpret them and act accordingly
    49. MAEK UN INTURPRETR Flex input is mostly straightforward     /* Arithmetic functions */ \"SUM OF\" { return SUMOF; } \"DIFF OF\" { return DIFFOF; } \"PRODUKT OF\" { return PRODUKTOF; } \"QUOSHUNT OF\" { return QUOSHUNTOF; }\\ Some regular expressions [A-Za-z][A-Za-z0-9_]+ { return IDENTIFIER; }
    50. MAEK UN INTURPRETR State machine allows for processing blocks   OBTW { BEGIN LOL_BLOCKCMT; } <LOL_BLOCKCMT>TLDR { BEGIN 0; lol_lex_debug(\"yylex: block comment\"); } <LOL_BLOCKCMT>\\n { lineNumber++; } <LOL_BLOCKCMT>. {}
    51. MAEK UN INTURPRETR Some code to make memory allocation work   #define YYSTACK_FREE pfree #define YYSTACK_MALLOC palloc #define YYFREE pfree #define YYMALLOC palloc ... %option noyyalloc noyyrealloc noyyfree PostgreSQL uses memory contexts, so the parser has to use them too.
    52. MAEK UN INTURPRETR Flex makes a big stream of tokens Bison turns them into something useful   Define the tokens:   %token HAI KTHXBYE EXPREND %token YARN IDENTIFIER NUMBR NUMBAR %token TROOFWIN TROOFFAIL TROOF Define grammar and associated actions %% lol_program: HAI EXPREND lol_cmdblock KTHXBYE EXPREND { executeProgram($3); }
    53. MAEK UN INTURPRETR Bison's job (in this case) is to build a parse tree We don't actually execute anything until the tree is built In some future day, we'll save the parse tree after the first execution, and avoid re-parsing each time a function is called Once the tree is built, we'll pass it to a function that will execute it Specifically, the data structure is a linked list where each node can point to the next node in the list as well as to a new list This is similar to the structure PL/pgSQL uses
    54. MAEK UN INTURPRETR typedef struct lolNodeList { int length; lolNode *head; lolNode *tail; } lolNodeList; struct lolNode { NodeType node_type; struct lolNode *next; /* list for sub-nodes */ lolNodeList *list; NodeVal nodeVal; };
    55. MAEK UN INTURPRETR Each lolNode is a particular statement VISIBLE ORLY IHASA ... Also, constants and variables are nodes YARN NUMBAR ... How each node uses the members of the lolNode struct depends on the node
    56. MAEK UN INTURPRETR TROOF node Really simple -- handles a TROOF (Boolean) constant NodeType node_type = TROOF NodeVal nodeVal = WIN or FAIL When executed, this node simply sets IT to the WIN or FAIL value stored in nodeVal
    57. MAEK UN INTURPRETR VISIBLE nodes Remember, VISIBLE rasies an exception Structure members: NodeType node_type = VISIBLE lolNodeList *list = pointer to a list of nodes containing: A constant string (a 1-node list) An expression evaluating to a string (possibly many nodes) NodeVal nodeVal = a number representing the message level (NOTICE, WARNING, etc.) When executed, this node runs the list, which puts a value in IT, and raises IT as an exception
    58. MAEK UN INTURPRETR SUMOF node SUM OF x AN y = x + y Structure members: lolNodeList *list == a two-element list, where each node represents an operand The result is returned in IT
    59. MAEK UN INTURPRETR ORLY node O RLY? == if, YA RLY == then, MEBBE == else if, NO WAI == else lolNodeList *list points to a set of nodes, one per YA RLY, MEBBE, and NO WAI block. Each YA RLY, MEBBE, and NO WAI block has a sub-list of its own, containing the instructions in the block MEBBE blocks also contain a list of nodes containing the MEBBE expression
    60. MAEK UN INTURPRETR So what does Bison need to make this work?
    61. MAEK UN INTURPRETR Token definitions These describe the data Flex generates   %token FOUNDYR IHASA ITZ R %token SUMOF DIFFOF PRODUKTOF QUOSHUNTOF MODOF BIGGROF SMALLROF AN %token BOTHOF EITHEROF WONOF NOT ALLOF ANYOF MKAY %token BOTHSAEM DIFFRINT These will turn into #define statements (e.g. #define DIFFRINT 123)
    62. MAEK UN INTURPRETR A definition of the grammar   lol_cmd: lol_expression EXPREND { $$ = $1; } | ORLY EXPREND lol_orly_block OIC EXPREND { $$ = lolMakeNode(ORLY, tmpnval, $3); } | WTF EXPREND lol_wtf_block OIC EXPREND { $$ = lolMakeNode(WTF, tmpnval, $3); } | IMINYR IDENTIFIER EXPREND lol_cmdblock IMOUTTAYR IDENTIFIER EXPREND { /* TODO: Check $2 and $7 for equality */ $$ = lolMakeNode(IMINYR, tmpnval, $4); }
    63. MAEK UN INTURPRETR A definition of the data type used in Bison's stack   %union { int numbrVal; double numbarVal; char *yarnVal; struct lolNodeList *nodeList; struct lolNode *node; } This is pretty much always a union
    64. MAEK UN INTURPRETR Type declarations to map various declared expressions to union members   %type <nodeList> lol_program lol_cmdblock lol_orly_block %type <node> lol_const lol_var lol_cmd lol_xaryoprgroup %type <yarnVal> YARN %type <yarnVal> IDENTIFIER
    65. MAEK UN INTURPRETR As the parser sees token streams it can interpret as a particular object in the grammar, it runs the code associated with that object, using $1, $2, etc. to identify objects on the stack   lol_expression: ... | IHASA lol_var ITZ lol_expression { $$ = lolMakeNode (DECL, ($2)->nodeVal, lolMakeList($4)); }
    66. MISELANEEUS DEVELOPMENT   ENVIRONMENT INTRO TO SPI
    67. SPI GIMMEH <var> OUTTA DATABUKKIT \"<query>\" Note that this only returns one value PL/LOLCODE uses a special parse tree node for SPI The expression that generates the query is stored in the node's sub-list Executing that list puts the text of the query in IT SPI sends that query to the server for processing The first column of the first row is stuck in the variable in the GIMMEH statement SPIval = SPI_getvalue(SPI_tuptable->vals[0], SPI_tuptable->tupdesc, 1); LOLifyString(SPIval, SPI_gettypeid(SPI_tuptable->tupdesc, 1), node- >nodeVal.yarnVal);
    68. Datums Everything is a Datum What a Datum actually is is architecture-dependant sizeof(Datum) == sizeof(long) >= sizeof(void *) >= 4 Different data types are stored in Datums in different ways You need to know what data type the datum is supposed to store SPI_tuptable is a global pointer to a structure that describes the last tuple returned by SPI Based on a Datum and knowledge of the type it represents, you can get useful values from it There are lots of functions, macros, etc. for processing Datums
    69. Development Environment Configure with --enable-debug, --enable-cassert  --enable-debug Include debugging symbols in final library This allows things like gdb to know where in the program you are, and tell you about it Disables optimizations Makes executables bigger --enable-cassert Adds (in)sanity checks Makes things slower More likely to tell you if you do something really stupid cf. memory context problems
    70. I CAN HAZ NAP NOW?
    71. U HAZ KWESCHUNZ?
    72. KTHXBYE
    73. Development Environment jtolley@uber:~/devel/pgsql$ ps -ef | grep postgres jtolley 7465 6283 1 13:23 ? 00:00:00 postgres: jtolley jtolley [local] idle jtolley@uber:~/devel/pgsql$ gdb postgres 7465 GNU gdb 6.8-debian ... Attaching to program: /home/jtolley/devel/pgdb/bin/postgres, process 7465 Reading symbols from /usr/lib/libxml2.so.2...done. Loaded symbols for /usr/lib/libxml2.so.2 ... Loaded symbols for /home/jtolley/devel/pgdb/lib/postgresql/pllolcode.so 0xb8003430 in __kernel_vsyscall () (gdb)
    SlideShare Zeitgeist 2009

    + Joshua DrakeJoshua Drake Nominate

    custom

    888 views, 1 favs, 2 embeds more stats

    Presentation presented by Joshua Tolley and Postgre more

    More info about this document

    © All Rights Reserved

    Go to text version

    • Total Views 888
      • 799 on SlideShare
      • 89 from embeds
    • Comments 0
    • Favorites 1
    • Downloads 0
    Most viewed embeds
    • 58 views on http://www.commandprompt.com
    • 31 views on http://www.enricopirozzi.info

    more

    All embeds
    • 58 views on http://www.commandprompt.com
    • 31 views on http://www.enricopirozzi.info

    less

    Flagged as inappropriate Flag as inappropriate
    Flag as inappropriate

    Select your reason for flagging this presentation as inappropriate. If needed, use the feedback form to let us know more details.

    Cancel
    File a copyright complaint
    Having problems? Go to our helpdesk?

    Categories