Working with databases in Perl

    Working with databases in Perl - Presentation Transcript

    1. Working with databases in Perl. Tutorial for YAPC::EU::2009, Lisbon. [email_address], Département Office
    2. Overview
      • intended audience : beginners
        • in Perl
        • in Databases
      • main topics
        • Persistent storage
        • RDBMS
        • SQL
        • Perl DBI architecture
        • Usage and efficiency
        • Object-Relational Mappings
      • disclaimer
        • didn't have personal exposure to everything mentioned in this tutorial
    3. Further info
      • Database textbooks
      • DBI manual ( L<DBI>, L<DBI::FAQ>, L<DBI::Profile> )
      • Book : "Programming the DBI"
      • Vendor's manuals
      • ORMs
        • DBIx::Class::Manual
        • DBIx::DataModel
       mastering databases requires a lot of reading !
    4. Persistent storage
      • RDBMS = Relational Database Management System
      • but maybe that's not what you want !
        • Other solutions for persistency :
          • BerkeleyDB : persistent hashes / arrays
          • Judy : persistent dynamic arrays / hashes
          • CouchDB : document-oriented database (JSON documents)
          • KiokuDB : persistent objects, front-end to BerkeleyDB / CouchDB / etc.
          • Plain Old File (using for example File::Tabular )
          • KinoSearch : bunch of fields with fulltext indexing
    5. Features of RDBMS
      • Relational
      • Indexing
      • Concurrency
      • Distributed
      • Transactions (commit / rollback )
      • Authorization
      • Triggers and stored procedures
      • Internationalization
      • Fulltext
    6. Choosing a RDBMS
      • Sometimes there is no choice (enforced by context) !
      • Criteria
        • cost, proprietary / open source
        • volume
        • features
        • resources (CPU, RAM, etc.)
        • ease of installation / deployment / maintenance
        • stored procedures
          • Postgres can have server-side procedures in Perl !
      • Common choices (open source)
        • SQLite (file-based)
        • mysql
        • Postgres
    7. Architecture
      • layers (in slide order)
        • Database
        • DBD driver
        • DBI
        • Object-Relational Mapper
        • Perl program
      • mottos shown next to the layers
        • TIOOWTDI : There is only one way to do it
        • TAMMMWTDI : There are many, many, many ways to do it
        • TIMTOWTDI : There is more than one way to do it
    8. DBD Drivers
        • Databases
          • Adabas DB2 DBMaker Empress Illustra Informix Ingres InterBase MaxDB Mimer Oracle Ovrimos PO Pg PrimeBase QBase Redbase SQLAnywhere SQLite Solid Sqlflex Sybase Unify mSQL monetdb mysql
        • Other kinds of data stores
          • CSV DBM Excel File iPod LDAP
        • Proxy, relay, etc
          • ADO Gofer JDBC Multi Multiplex ODBC Proxy SQLRelay
        • Fake, test
          • NullP Mock RAM Sponge
    9. When SomeExoticDB has no driver
      • Quotes from DBI::DBD :
          • "The first rule for creating a new database driver for the Perl DBI is very simple: DON'T!"
          • "The second rule for creating a new database driver for the Perl DBI is also very simple: Don't -- get someone else to do it for you!"
      • nevertheless there is good advice/examples
        • see DBI::DBD
      • Other solution : forward to other drivers
        • ODBC (even on Unix)
        • JDBC
        • SQLRelay
    10. Talking to a RDBMS
      • SQL : Structured Query Language, nominally a standard. Except that
        • the standard is hard to find (not publicly available)
        • not all vendors implement the full standard
        • most vendors have non-standard extensions
        • it's not only about queries
          • DML : Data Manipulation Language
          • DDL : Data Definition Language
    11. Writing SQL
      • "SQL is too low-level, I don't ever want to see it"
      • "SQL is the most important part of my application, I won't let anybody write it for me"
    12. Data Definition Language (DDL)
      • CREATE TABLE author (
      • author_id INTEGER PRIMARY KEY,
      • author_name VARCHAR(20),
      • e_mail VARCHAR(20)
      • );
      • CREATE/ALTER/DROP/RENAME
      • DATABASE
      • INDEX
      • VIEW
      • TRIGGER
    13. Modeling (UML) [class diagram]
      • classes : Author, Distribution, Module
      • Author 1 to * Distribution
      • Distribution 1 to * Module ( ► contains )
      • Distribution * to * Module ( ► depends on )
    14. Terminology [same diagram, annotated]
      • terms labelled on the diagram : class, association, association name, multiplicity, composition
    15. Implementation [table diagram]
      • Author ( author_id, author_name, e_mail )
      • Distribution ( distrib_id, distrib_name, d_release, author_id )
      • Module ( module_id, module_name, distrib_id )
      • Dependency ( distrib_id, module_id ) : link table for n-to-n association
    16. Various naming conventions for primary/foreign keys
      • author.author_id ↔ distribution.author_id
          • RDBMS knows how to perform joins ( "NATURAL JOIN" )
      • author.id ↔ distribution.author_id
          • ORM knows how to perform joins (RoR ActiveRecord)
          • SELECT * FROM table1, table2 … → which id ?
      • author.id ↔ distribution.author
          • $a_distrib->author() : foreign key or related record ?
      • columns for joins should always be indexed
    17. Data Manipulation Language (DML)
      • SELECT author_name, distribution_name
      • FROM author INNER JOIN distribution
      • ON author.author_id = distribution.author_id
      • WHERE distribution_name like 'DBD::%';
      • INSERT INTO author ( author_id, author_name, e_mail )
      • VALUES ( 123, 'JFOOBAR', 'john@foobar.com' );
      • UPDATE author
      • SET e_mail = 'john@foobar.com'
      • WHERE author_id = 3456;
      • DELETE FROM author
      • WHERE author_id = 3456;
    18. Best practice : placeholders
      • SELECT author_name, distribution_name
      • FROM author INNER JOIN distribution
      • ON author.author_id = distribution.author_id
      • WHERE distribution_name like ? ;
      • INSERT INTO author ( author_id, author_name, e_mail )
      • VALUES ( ? , ? , ? );
      • UPDATE author
      • SET e_mail = ?
      • WHERE author_id = ? ;
      • DELETE FROM author
      • WHERE author_id = ? ;
      • no type distinction (int/string)
      • statements can be cached
      • avoid SQL injection problems
        • SELECT * FROM foo
        • WHERE val = $x ;
        • with $x eq '123; DROP TABLE foo'
      • sometimes other syntax (for ex. $1, $2)
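The placeholder advice above can be tried with a minimal sketch. It assumes that DBD::SQLite is installed and uses an in-memory database; the second author row and the hostile string are made-up illustration data, not part of the original slides.

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is available; ':memory:' keeps the sketch self-contained.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
                       { RaiseError => 1, AutoCommit => 1 });

$dbh->do('CREATE TABLE author (author_id INTEGER PRIMARY KEY, '
       . 'author_name VARCHAR(20), e_mail VARCHAR(20))');

# One prepared statement, executed several times with different bind values.
my $insert = $dbh->prepare(
    'INSERT INTO author (author_id, author_name, e_mail) VALUES (?, ?, ?)');
$insert->execute(123, 'JFOOBAR', 'john@foobar.com');
$insert->execute(124, 'JDOE',    'jane@doe.org');

# A hostile value is passed as data; it is never spliced into the SQL text.
my $hostile = q{JFOOBAR'; DROP TABLE author; --};
my $select  = $dbh->prepare('SELECT author_id FROM author WHERE author_name = ?');
$select->execute($hostile);
my $matches = scalar @{ $select->fetchall_arrayref };

my ($count) = $dbh->selectrow_array('SELECT COUNT(*) FROM author');
print "matches: $matches, authors still in table: $count\n";
```

The hostile string simply matches no row, and the table keeps both authors, because the driver receives the SQL text and the values separately.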
    19. DBI API
      • handles
        • the whole package (DBI)
        • driver handle ($drh)
        • database handle ($dbh)
        • statement handle ($sth)
      • interacting with handles
        • object-oriented
          • ->connect(…), ->prepare(…), ->execute(...), …
        • tied hash
          • ->{AutoCommit}, ->{NAME_lc}, ->{CursorName}, …
    20. Connecting
      • my $dbh = DBI->connect($connection_string);
      • my $dbh = DBI->connect($connection_string,
      • $user,
      • $password,
      • { %attributes } );
      • my $dbh = DBI->connect_cached(@args);
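As a rough sketch of the four-argument form and of the attribute hash, assuming DBD::SQLite is installed (the in-memory database and the table name no_such_table are only illustrative):

```perl
use strict;
use warnings;
use DBI;

# Connection string, user, password, attributes.
my $dbh = DBI->connect(
    'dbi:SQLite:dbname=:memory:',
    '', '',
    { RaiseError => 1, PrintError => 0, AutoCommit => 1 },
);

# With RaiseError on, a failing statement throws an exception.
my $ok        = eval { $dbh->do('SELECT * FROM no_such_table'); 1 };
my $exception = $ok ? '' : $@;

# Attributes can also be changed later through the hash API;
# 'local' restores the previous value at the end of the block.
my ($rv, $errstr);
{
    local $dbh->{RaiseError} = 0;
    $rv     = $dbh->do('SELECT * FROM no_such_table');   # returns undef on error
    $errstr = $dbh->errstr;
}

print "exception raised: ", ($ok ? 'no' : 'yes'), "\n";
print "return value without RaiseError: ", (defined $rv ? $rv : 'undef'), "\n";
```

Which style to use is a design choice: exceptions keep the main code path free of error checks, while return-value checking can be handy for a single statement that is expected to fail sometimes.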
    21. Some dbh attributes
      • AutoCommit
        • if true, every statement is immediately committed
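        • if false, changes stay pending until an explicit
          • $dbh->commit(); # or $dbh->rollback();
        • if true, a transaction can still be opened with
          • $dbh->begin_work(); # turns AutoCommit off until commit / rollback
          • … # inserts, updates, deletes
          • $dbh->commit();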
      • RaiseError
        • like autodie for standard Perl functions : errors raise exceptions
      • see also
        • PrintError
        • HandleError
        • ShowErrorStatement
      • and also
        • LongReadLen
        • LongTruncOk
        • RowCacheSize
      • hash API : attributes can be set dynamically
        • [ local ] $dbh->{$attr_name} = $val
      • peek at $dbh internals
      • DB<1> x $dbh → {}
      • DB<2> x tied %$dbh → {…}
    22. Lost connection
      • manual recover
          • if ($dbh->errstr =~ /broken connection/i) { … }
      • DBIx::RetryOverDisconnects
        • intercepts requests (prepare, execute, …)
        • filters errors
        • attempts to reconnect and restart the transaction
      • some ORMs have their own layer for recovering connections
      • some drivers have their own mechanism
          • $dbh->{mysql_auto_reconnect} = 1;
    23. Data retrieval
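      • my $sth = $dbh->prepare($sql);
      • $sth->execute(@bind_values);
      • my @columns = @{ $sth->{NAME} };
      • while (my $row_aref = $sth->fetch) {
      • … # use @$row_aref ; the same array is reused by the next fetch
      • }
      • # or, for statements that return no rows (INSERT, UPDATE, DELETE, DDL)
      • $dbh->do($sql);
      • see also : prepare_cached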
    24. Other ways of fetching
      • single row
          • fetchrow_array
          • fetchrow_arrayref (a.k.a fetch)
          • fetchrow_hashref
      • lists of rows (with optional slicing)
          • fetchall_arrayref
          • fetchall_hashref
      • prepare, execute and fetch
          • selectall_arrayref
          • selectall_hashref
      • vertical slice
          • selectcol_arrayref
       little DBI support for cursors
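The fetching variants listed above differ mainly in the shape of what they return. The following sketch assumes DBD::SQLite is installed and uses two made-up author rows; it also shows that a NULL column comes back as undef.

```perl
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '', { RaiseError => 1 });
$dbh->do('CREATE TABLE author (author_id INTEGER PRIMARY KEY, '
       . 'author_name VARCHAR(20), e_mail VARCHAR(20))');
my $ins = $dbh->prepare(
    'INSERT INTO author (author_id, author_name, e_mail) VALUES (?, ?, ?)');
$ins->execute(1, 'JFOOBAR', 'john@foobar.com');
$ins->execute(2, 'JDOE',    undef);              # undef is stored as NULL

my $sql = 'SELECT author_id, author_name, e_mail FROM author ORDER BY author_id';

# prepare, execute and fetch in one call
my $rows_as_arrays = $dbh->selectall_arrayref($sql);                   # [ [1, 'JFOOBAR', ...], ... ]
my $rows_as_hashes = $dbh->selectall_arrayref($sql, { Slice => {} });  # [ { author_id => 1, ... }, ... ]
my $rows_by_id     = $dbh->selectall_hashref($sql, 'author_id');       # { 1 => { ... }, 2 => { ... } }

# vertical slice : one column only
my $names = $dbh->selectcol_arrayref(
    'SELECT author_name FROM author ORDER BY author_id');

# single row through a statement handle
my $sth = $dbh->prepare('SELECT author_name, e_mail FROM author WHERE author_id = ?');
$sth->execute(2);
my $row = $sth->fetchrow_hashref;
$sth->finish;

print join(', ', @$names), "\n";
printf "e_mail of %s is %s\n",
    $row->{author_name}, defined $row->{e_mail} ? $row->{e_mail} : 'NULL';
```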
    25. Efficiency
      • my $sth = $dbh->prepare(q{
      • SELECT author_id, author_name, e_mail
      • FROM author
      • });
      • my ($id, $name, $e_mail);
      • $sth->execute;
      • $sth->bind_columns( \($id, $name, $e_mail) ); # bind references to the variables
      • while ($sth->fetch) {
      • print "author $id is $name at $e_mail\n";
      • }
      • avoids cost of allocating / deallocating Perl variables
      • the row buffer is reused : don't store a reference and reuse it after another fetch
    26. Datatypes
      • NULL ↔ undef
      • INTEGER, VARCHAR, DATE ↔ perl scalar
        • usually DWIM works
        • if needed, can specify explicitly
          • $sth->bind_param($col_num, $value, SQL_DATETIME);
      • BLOB ↔ perl scalar
      • ARRAY (Postgres) ↔ arrayref
    27. Large objects
      • usually : just scalars in memory
      • when reading : control BLOB size
        • $dbh->{LongReadLen} = $max_bytes;
        • $dbh->{LongTruncOk} = 1;
      • when writing : can inform the driver
        • $sth->bind_param($ix, $blob, SQL_BLOB);
      • driver-specific stream API. Ex :
        • Pg : pg_lo_open, pg_lo_write, pg_lo_lseek
        • Oracle : ora_lob_read(…), ora_lob_write(…), ora_lob_append(…)
    28. Transactions
      • $dbh->{RaiseError} = 1; # errors will raise exceptions
      • eval {
      • $dbh->begin_work(); # will turn off AutoCommit
      • … # inserts, updates, deletes
      • $dbh->commit();
      • };
      • if ($@) {
      • my $err = $@;
      • eval { $dbh->rollback() };
      • my $rollback_result = $@ || "SUCCESS";
      • die "FAILED TRANSACTION : $err"
      • . "; ROLLBACK: $rollback_result";
      • }
      • encapsulated in DBIx::Transaction or ORMs
      • $schema->transaction( sub { … } );
      • nested transactions : must keep track of transaction depth
      • savepoint / release : only in DBIx::Class
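The commit / rollback pattern of this slide can be wrapped in a small helper. The sketch below assumes DBD::SQLite is installed; the helper name in_transaction and the sample rows are hypothetical and do not come from DBI or from any of the modules mentioned above.

```perl
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
                       { RaiseError => 1, PrintError => 0, AutoCommit => 1 });
$dbh->do('CREATE TABLE author (author_id INTEGER PRIMARY KEY, author_name VARCHAR(20))');

my $insert = 'INSERT INTO author (author_id, author_name) VALUES (?, ?)';
$dbh->do($insert, undef, 1, 'JFOOBAR');

# Hypothetical helper: run a code reference inside one transaction.
sub in_transaction {
    my ($dbh, $code) = @_;
    $dbh->begin_work;                       # turns AutoCommit off
    my $ok = eval { $code->(); $dbh->commit; 1 };
    return 1 if $ok;
    my $err = $@ || 'unknown error';
    eval { $dbh->rollback };
    my $rollback_result = $@ || 'SUCCESS';
    die "FAILED TRANSACTION : $err; ROLLBACK: $rollback_result\n";
}

# The second insert violates the primary key, so the first one is rolled back too.
my $ok = eval {
    in_transaction($dbh, sub {
        $dbh->do($insert, undef, 2, 'JDOE');
        $dbh->do($insert, undef, 1, 'DUPLICATE');
    });
    1;
};
my $message = $ok ? '' : $@;

my ($count) = $dbh->selectrow_array('SELECT COUNT(*) FROM author');
print "authors after failed transaction: $count\n";
```

This sketch does not handle nesting; as noted above, nested transactions require keeping track of the transaction depth.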
    29. Locks and isolation levels
      • Locks on rows
        • shared
          • other clients can also get a shared lock
          • requests for exclusive lock must wait
        • exclusive
          • all other requests for locks must wait
      • Intention locks (on whole tables)
        • Intent shared
        • Intent exclusive
      • Isolation levels
        • read-uncommitted
        • read-committed
        • repeatable-read
        • serializable
      • related SQL
        • SELECT … FOR READ ONLY
        • SELECT … FOR UPDATE
        • SELECT … LOCK IN SHARE MODE
        • LOCK TABLE(S) … READ/WRITE
        • SET TRANSACTION ISOLATION LEVEL …
    30. Tracing / profiling
      • $dbh->trace($trace_setting, $trace_where)
        • 0 - Trace disabled.
        • 1 - Trace top-level DBI method calls returning with results or errors.
        • 2 - As above, adding tracing of top-level method entry with parameters.
        • 3 - As above, adding some high-level information from the driver and some internal information from the DBI.
      • $dbh->{Profile} = 2; # profile at the statement level
        • many powerful options
        • see L<DBI::Profile>
    31. Metadata
      • datasources
        • my @sources = DBI->data_sources($driver);
      • table_info
        • my $sth = $dbh->table_info(@search_criteria);
        • while (my $row = $sth->fetchrow_hashref) {
        • print "$row->{TABLE_NAME} : $row->{TABLE_TYPE}\n";
        • }
      • others
        • column_info()
        • primary_key_info()
        • foreign_key_info()
       many drivers only have partial implementations
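A short sketch of the metadata calls, assuming DBD::SQLite is installed and that its table_info and column_info implementations accept undef for catalog and schema (other drivers may behave differently, as the note above warns):

```perl
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '', { RaiseError => 1 });
$dbh->do('CREATE TABLE author (author_id INTEGER PRIMARY KEY, '
       . 'author_name VARCHAR(20), e_mail VARCHAR(20))');

# List ordinary tables: catalog, schema, table pattern, type.
my @tables;
my $sth = $dbh->table_info(undef, undef, '%', 'TABLE');
while (my $row = $sth->fetchrow_hashref) {
    push @tables, $row->{TABLE_NAME};
}

# List the columns of one table: catalog, schema, table, column pattern.
my @columns;
$sth = $dbh->column_info(undef, undef, 'author', '%');
while (my $row = $sth->fetchrow_hashref) {
    push @columns, $row->{COLUMN_NAME};
}

print "tables  : @tables\n";
print "columns : @columns\n";
```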
    32. Stored procedures
      • my $sth = $dbh->prepare($db_specific_sql);
      • # prepare params to be passed to the called procedure
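      • $sth->bind_param(1, $val1);
      • $sth->bind_param(2, $val2);
      • # prepare memory locations to receive the results
      • # (pass references, plus $max_len : the maximum expected length of each result)
      • $sth->bind_param_inout(3, \$result1, $max_len);
      • $sth->bind_param_inout(4, \$result2, $max_len);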
      • # execute the whole thing
      • $sth->execute;
    33. Cursors
      • my $sql = "SELECT * FROM SomeTable FOR UPDATE";
      • my $sth1 = $dbh->prepare($sql);
      • $sth1->execute();
      • my $curr = "CURRENT OF $sth1->{CursorName}";
      • while (my $row = $sth1->fetch) {
      • if (…) {
      • $dbh->do("DELETE FROM SomeTable WHERE $curr");
      • } else {
      • my $sth2 = $dbh->prepare(
      • "UPDATE SomeTable SET col = ? WHERE $curr");
      • $sth2->execute($new_val);
      • }
      • }
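    34. Object-Relational Mapping (ORM)
      • [diagram] RDBMS side : table1 with columns c1, c2, c3 and rows r1, r2, ... ; table2 with columns c3, c4
      • [diagram] RAM side : class1 with attributes +c1: String, +c2: String, +c3: class2 ; objects r1 : class1, r2 : class1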
    35. ORM: What for ?
      • [catalyst list] On Thu, 2006-06-08, Steve wrote:
      • Not intending to start any sort of rancorous discussion,
      • but I was wondering whether someone could illuminate
      • me a little?
      • I'm comfortable with SQL, and with DBI. I write basic
      • SQL that runs just fine on all databases, or more
      • complex SQL when I want to target a single database
      • (usually postgresql).
      • What value does an ORM add for a user like me?
    36. ORM useful for …
      • dynamic SQL
        • navigation between tables
        • generate complex SQL queries from Perl datastructures
        • better than phrasebook or string concatenation
      • automatic data conversions (inflation / deflation)
      • expansion of tree data structures coded in the relational model
      • transaction encapsulation
      • data validation
      • computed fields
      • caching
       See Also : http://lists.scsys.co.uk/pipermail/catalyst/2006-June
    37. Impedance mismatch
      • SELECT c1, c2 FROM table1
          • → missing c3 , so cannot navigate to class2
          • is it a valid instance of class1 ?
      • SELECT * FROM table1 LEFT JOIN table2 ON …
          • → what to do with the c4 column ?
          • is it a valid instance of class1 ?
      • SELECT c1, c2, length(c2) AS l_c2 FROM table1
          • → no predeclared method in class1 for accessing l_c2
      • [diagram as on slide 34 : table1 ( c1, c2, c3 ) and table2 ( c3, c4 ) in the RDBMS ; class1 ( +c1: String, +c2: String, +c3: class2 ) and object r1 : class1 in RAM]
    38. ORM Landscape
      • Leader
        • DBIx::Class (a.k.a. DBIC)
      • Also discussed here
        • DBIx::DataModel
      • Many others
        • Rose::DB, Jifty::DBI, Fey::ORM, ORM, DBIx::ORM::Declarative, Tangram, Coat::Persistent, …
    39. DBIx::Class Schema
      • package MyDatabase::Main;
      • use base qw/DBIx::Class::Schema/;
      • __PACKAGE__->load_namespaces;
      • package MyDatabase::Main::Result::Artist;
      • use base qw/DBIx::Class/;
      • __PACKAGE__->load_components(qw/PK::Auto Core/);
      • __PACKAGE__->table('artist');
      • __PACKAGE__->add_columns(qw/ artistid name /);
      • __PACKAGE__->set_primary_key('artistid');
      • __PACKAGE__->has_many('cds' =>
      • 'MyDatabase::Main::Result::Cd');
      • package ...
      • ...
    40. DBIx::Class usage
      • my $schema = MyDatabase::Main
      • ->connect('dbi:SQLite:db/example.db');
      • my @artists = (['Michael Jackson'], ['Eminem']);
      • $schema->populate('Artist', [
      • [qw/name/],
      • @artists,
      • ]);
      • my $rs = $schema->resultset('Track')->search(
      • {
      • 'cd.title' => $cdtitle
      • },
      • {
      • join => [qw/ cd /],
      • }
      • );
      • while (my $track = $rs->next) {
      • print $track->title . "\n";
      • }
    41. DBIx::DataModel Schema
      • package MyDatabase;
      • use DBIx::DataModel;
      • DBIx::DataModel->Schema(__PACKAGE__)
      • ->Table(qw/Artist artist artistid/)
      • ->Table(qw/CD cd cdid /)
      • ->Table(qw/Track track trackid /)
      • ->Association([qw/Artist artist 1 /],
      • [qw/CD cds 0..* /])
      • ->Composition([qw/CD cd 1 /],
      • [qw/Track tracks 1..* /]);
    42. DBIx::DataModel usage
      • my $dbh = DBI->connect('dbi:SQLite:db/example.db');
      • MyDatabase->dbh($dbh);
      • my @artists = (['Michael Jackson'], ['Eminem']);
      • MyDatabase::Artist->insert(map { +{ name => $_->[0] } } @artists);
      • my $statement = MyDatabase->join(qw/CD tracks/)->select(
      • -columns => [qw/track.title|trtitle …/],
      • -where => { 'cd.title' => $cdtitle },
      • -resultAs => 'statement', # default : arrayref of rows
      • );
      • while (my $track = $statement->next) {
      • print "$track->{trtitle}\n";
      • }