Relational Database Access with Python ‘sans’ ORM



Slides from my PyCon APAC 2012 talk in Singapore

Published in: Technology, Education
  • For some Python programmers, their only exposure to accessing relational data is via an object relational mapper (ORM). As powerful as the concept of mapping objects to data is, sometimes it is much simpler to manipulate your relational data using SQL. This talk will be about using the DB-API, Python’s standard mechanism for accessing relational databases.
  • Or maybe you prefer SQLAlchemy to abstract away the database.
  • SQL (Structured Query Language) is a DSL, and with it we can achieve the same results as the previous two slides. This is what DBAs program in.
  • This diagram no longer seems to exist on Travis’s site
  • Always use parameter binding. Why?
    * You normally get better performance from some database engines due to SQL query caching.
    * It reduces the chance of SQL injection.
  • Gerald is a general purpose database schema toolkit written in Python. It can be used for cataloguing, managing and deploying database schemas. It is designed to allow you to easily identify the differences between databases.
  • SQLPython is a command-line interface to relational databases written in Python. It was created as an alternative to Oracle’s SQL*Plus, and can likewise be used instead of postgres’ psql or mysql’s mysql text clients. In addition, it offers several extra features inspired by other command-line clients: neatened output, smart prompt, tab completion, history, scripting, output to file, paste buffer & os command, unix-like commands (ls, cat, grep), and data dictionary exploration. Another feature is special output formats. By replacing the ; that terminates a SELECT statement with a backslash-character sequence, you can get output in a number of useful formats like xml, json, csv, etc.
  • One of the most powerful features is the py command. The py command allows the user to execute Python commands, either one at a time (with py {command}) or in an interactive environment (beginning with a bare py statement, and continuing until Ctrl-D, quit(), or exit() is entered). A history of result sets from each query is exposed to the Python session as the list r; the most recent result set is r[-1]. Each row can be referenced as a tuple, or as an object with an attribute for each column.
  • Spring Python takes the concepts of the Spring Framework and Spring Security, and brings them to the world of Python. It isn't a simple line-by-line port of the code. Instead, it takes some powerful ideas that were discovered in the realm of Java, and pragmatically applies them in the world of Python. One of these paradigms is a Portable Service Abstraction called DatabaseTemplate.
    * It is portable because it uses Python's standardized API, not tying us to any database vendor. Instead, in our example, we injected in an instance of Sqlite3ConnectionFactory.
    * It provides the useful service of easily accessing information stored in a relational database, but letting us focus on the query, not the plumbing code.
    * It offers a nice abstraction over Python's low level database API with reduced code noise. This allows us to avoid the cost and risk of writing code to manage cursors and exception handling.
    DatabaseTemplate handles exceptions by catching and holding them, then properly closing the cursor. It then raises it wrapped inside a Spring Python DataAccessException. This way, database resources are properly disposed of without losing the exception stack trace. The Database Template can be used in isolation from the SpringPython framework.

    1. Relational Database Access with Python ‘sans’ ORM
       Mark Rees, CTO, Century Software (M) Sdn. Bhd.
    2. Your Current Relational Database Access Style?

           # Django ORM
           >>> from ip2country.models import Ip2Country
           >>> Ip2Country.objects.all()
           [<Ip2Country: Ip2Country object>, <Ip2Country: Ip2Country object>, '...(remaining elements truncated)...']
           >>> sgp = Ip2Country.objects.filter(assigned__year=2012)\
           ...     .filter(countrycode2='SG')
           >>> sgp[0].ipfrom
           1729580032.0
    3. Your Current Relational Database Access Style?

           # SQLAlchemy ORM
           >>> from sqlalchemy import create_engine, extract
           >>> from sqlalchemy.orm import sessionmaker
           >>> from models import Ip2Country
           >>> engine = create_engine('postgresql://ip2country_rw:secret@localhost/ip2country')
           >>> Session = sessionmaker(bind=engine)
           >>> session = Session()
           >>> all_data = session.query(Ip2Country).all()
           >>> sgp = session.query(Ip2Country).\
           ...     filter(extract('year', Ip2Country.assigned) == 2012).\
           ...     filter(Ip2Country.countrycode2 == 'SG')
           >>> print sgp[0].ipfrom
           1729580032.0
    4. SQL Relational Database Access

           SELECT * FROM ip2country;

           "ipfrom";"ipto";"registry";"assigned";"countrycode2";"countrycode3";"countryname"
           1729522688;1729523711;"apnic";"2011-08-05";"CN";"CHN";"China"
           1729523712;1729524735;"apnic";"2011-08-05";"CN";"CHN";"China"
           ...

           SELECT * FROM ip2country
           WHERE date_part('year', assigned) = 2012
           AND countrycode2 = 'SG';

           "ipfrom";"ipto";"registry";"assigned";"countrycode2";"countrycode3";"countryname"
           1729580032;1729581055;"apnic";"2012-01-16";"SG";"SGP";"Singapore"
           1729941504;1729942527;"apnic";"2012-01-10";"SG";"SGP";"Singapore"
           ...

           SELECT ipfrom FROM ip2country
           WHERE date_part('year', assigned) = 2012
           AND countrycode2 = 'SG';

           "ipfrom"
           1729580032
           1729941504
           ...
    5. Python + SQL == Python DB-API 2.0
       • The Python standard for a consistent interface to relational databases is the Python DB-API (PEP 249)
       • The majority of Python database interfaces adhere to this standard
    6. Python DB-API UML Diagram
    7. Python DB-API Connection Object
       Access the database via the connection object
       • Use connect constructor to create a connection with database
             conn = psycopg2.connect(parameters…)
       • Create cursor via the connection
             cur = conn.cursor()
       • Transaction management (implicit begin)
             conn.commit()
             conn.rollback()
       • Close connection (will rollback current transaction)
             conn.close()
       • Check module capabilities by globals
             psycopg2.apilevel
             psycopg2.threadsafety
             psycopg2.paramstyle
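The connection lifecycle on this slide can be tried without a database server. The following is a minimal sketch, assuming Python 3 and the standard-library `sqlite3` module standing in for psycopg2 (the slides themselves use Python 2 and PostgreSQL); the two-column table is a cut-down, illustrative version of `ip2country`.

```python
import sqlite3

# Module-level globals required by the DB-API (PEP 249).
print(sqlite3.apilevel)      # DB-API version supported by the module
print(sqlite3.threadsafety)  # thread-safety level (0 to 3)
print(sqlite3.paramstyle)    # parameter marker style the module expects

# connect() is the module's constructor for connection objects.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE ip2country (ipfrom INTEGER, countrycode2 TEXT)")
conn.commit()

# A transaction begins implicitly with the first data-modifying statement.
cur.execute("INSERT INTO ip2country VALUES (?, ?)", (1729580032, "SG"))
conn.rollback()  # discard the uncommitted insert

cur.execute("SELECT count(*) FROM ip2country")
rows_after_rollback = cur.fetchone()[0]

cur.execute("INSERT INTO ip2country VALUES (?, ?)", (1729580032, "SG"))
conn.commit()    # make the insert permanent

cur.execute("SELECT count(*) FROM ip2country")
rows_after_commit = cur.fetchone()[0]

cur.close()
conn.close()
```

Because every DB-API module exposes the same `connect`, `cursor`, `commit`, `rollback` and `close` names, the same sequence should carry over to other adaptors with only the connect arguments and the parameter marker changed.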
    8. Python DB-API Cursor Object
       A cursor object is used to represent a database cursor, which is used to manage the context of fetch operations.
       • Cursors created from the same connection are not isolated
             cur = conn.cursor()
             cur2 = conn.cursor()
       • Cursor methods
             cur.execute(operation, parameters)
             cur.executemany(op, seq_of_parameters)
             cur.fetchone()
             cur.fetchmany([size=cursor.arraysize])
             cur.fetchall()
             cur.close()
    9. Python DB-API Cursor Object
       • Optional cursor methods
             cur.scroll(value[, mode='relative'])
             cur.callproc(procname[, parameters])
             cur.__iter__()
       • Results of an operation
             cur.description
             cur.rowcount
             cur.lastrowid
       • DB adaptor specific “proprietary” cursor methods
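The cursor methods and attributes listed on the two slides above can be exercised as follows. This is a sketch under the same assumptions as before (Python 3, standard-library `sqlite3` instead of psycopg2); the three sample rows are taken from the query output shown earlier in the talk.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE ip2country (ipfrom INTEGER, ipto INTEGER, countrycode2 TEXT)"
)

# executemany() runs one statement once per parameter sequence.
sample_rows = [
    (1729522688, 1729523711, "CN"),
    (1729580032, 1729581055, "SG"),
    (1729941504, 1729942527, "SG"),
]
cur.executemany("INSERT INTO ip2country VALUES (?, ?, ?)", sample_rows)
inserted = cur.rowcount  # rows affected by the last operation
conn.commit()

cur.execute("SELECT ipfrom, ipto, countrycode2 FROM ip2country ORDER BY ipfrom")
# description holds one 7-item sequence per result column; item 0 is the name.
column_names = [col[0] for col in cur.description]

first = cur.fetchone()    # next row, or None when the result set is exhausted
batch = cur.fetchmany(1)  # list of up to `size` rows
rest = cur.fetchall()     # list of all remaining rows

# Iterating over the cursor (an optional extension) yields rows one by one.
cur.execute(
    "SELECT ipfrom FROM ip2country WHERE countrycode2 = ? ORDER BY ipfrom",
    ("SG",),
)
sg_starts = [row[0] for row in cur]

cur.close()
conn.close()
```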
    10. Python DB-API Parameter Styles
        Allows you to keep SQL separate from parameters
        Improves performance & security
        Warning: Never, never, NEVER use Python string concatenation (+) or string parameters interpolation (%) to pass variables to a SQL query string. Not even at gunpoint.
        From
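To see why the warning above is so emphatic, compare string interpolation with parameter binding on the same input. This is an illustrative sketch in Python 3 using `sqlite3`; the hostile `user_input` value is a made-up example, not something from the talk.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE ip2country (ipfrom INTEGER, countrycode2 TEXT)")
cur.executemany(
    "INSERT INTO ip2country VALUES (?, ?)",
    [(1729522688, "CN"), (1729580032, "SG")],
)

# Hypothetical hostile input, as might arrive from a web form.
user_input = "SG' OR '1'='1"

# UNSAFE: the input becomes part of the SQL text and changes its meaning.
unsafe_sql = "SELECT ipfrom FROM ip2country WHERE countrycode2 = '%s'" % user_input
cur.execute(unsafe_sql)
unsafe_rows = cur.fetchall()  # every row matches the rewritten WHERE clause

# SAFE: the input is passed separately and treated purely as a value.
cur.execute("SELECT ipfrom FROM ip2country WHERE countrycode2 = ?", (user_input,))
safe_rows = cur.fetchall()  # no country code equals the hostile string

conn.close()
```

With binding, the driver never parses the value as SQL, so the quote characters in the input cannot close the string literal.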
    11. Python DB-API Parameter Styles
        Global paramstyle gives supported style for the adaptor

            qmark     Question mark style         WHERE countrycode2 = ?
            numeric   Numeric positional style    WHERE countrycode2 = :1
            named     Named style                 WHERE countrycode2 = :code
            format    ANSI C printf format style  WHERE countrycode2 = %s
            pyformat  Python format style         WHERE countrycode2 = %(name)s
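Two of the styles in the table can be tried directly with the standard-library `sqlite3` module, which reports `qmark` as its paramstyle and also accepts `named` markers. A small Python 3 sketch follows; psycopg2, used elsewhere in these slides, expects the `%s` and `%(name)s` forms instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE ip2country (ipfrom INTEGER, countrycode2 TEXT)")
cur.execute("INSERT INTO ip2country VALUES (?, ?)", (1729580032, "SG"))

# qmark style: positional parameters supplied as a sequence.
cur.execute("SELECT ipfrom FROM ip2country WHERE countrycode2 = ?", ("SG",))
by_qmark = cur.fetchall()

# named style: parameters supplied as a mapping keyed by marker name.
cur.execute(
    "SELECT ipfrom FROM ip2country WHERE countrycode2 = :code",
    {"code": "SG"},
)
by_named = cur.fetchall()

conn.close()
```

Code meant to run against several adaptors usually has to check the module's `paramstyle` global, since the marker syntax is the least portable part of the DB-API.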
    12. Python + SQL: INSERT

            import csv, datetime, psycopg2

            conn = psycopg2.connect("dbname=ip2country user=ip2country_rw password=secret")
            cur = conn.cursor()
            with open("IpToCountry.csv", "rb") as f:
                reader = csv.reader(f)
                try:
                    for row in reader:
                        print row
                        if row[0][0] != "#":
                            row[3] = datetime.datetime.utcfromtimestamp(float(row[3]))
                            cur.execute("""INSERT INTO ip2country(
                                ipfrom, ipto, registry, assigned,
                                countrycode2, countrycode3, countryname)
                                VALUES (%s, %s, %s, %s, %s, %s, %s)""", row)
                except:
                    conn.rollback()
                else:
                    conn.commit()
                finally:
                    cur.close()
                    conn.close()
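The same load can be sketched in Python 3 against an in-memory SQLite database, which makes it runnable without PostgreSQL or the real data file. The `SAMPLE_CSV` text and the `load_rows` helper are illustrative assumptions; only the column layout and the comment-line convention come from the slide.

```python
import csv
import io
import sqlite3
from datetime import datetime, timezone

# Illustrative stand-in for IpToCountry.csv: lines starting with '#' are comments.
SAMPLE_CSV = """\
# comment line
"1729580032","1729581055","apnic","1326672000","SG","SGP","Singapore"
"1729522688","1729523711","apnic","1312502400","CN","CHN","China"
"""


def load_rows(csv_text):
    """Yield parameter tuples for the INSERT, skipping blank and comment lines."""
    for row in csv.reader(io.StringIO(csv_text)):
        if not row or row[0].startswith("#"):
            continue
        # Column 3 is a Unix timestamp; store it as an ISO date string.
        assigned = datetime.fromtimestamp(float(row[3]), tz=timezone.utc).date()
        yield (int(row[0]), int(row[1]), row[2], assigned.isoformat(),
               row[4], row[5], row[6])


conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""CREATE TABLE ip2country (
    ipfrom INTEGER, ipto INTEGER, registry TEXT, assigned TEXT,
    countrycode2 TEXT, countrycode3 TEXT, countryname TEXT)""")
try:
    # executemany() binds each tuple to the seven placeholders in turn.
    cur.executemany(
        "INSERT INTO ip2country VALUES (?, ?, ?, ?, ?, ?, ?)",
        load_rows(SAMPLE_CSV),
    )
except sqlite3.Error:
    conn.rollback()
    raise
else:
    conn.commit()

cur.execute("SELECT assigned, countryname FROM ip2country ORDER BY ipfrom")
loaded = cur.fetchall()
conn.close()
```

Catching the module's `Error` class and re-raising after the rollback keeps the failure visible, whereas a bare `except:` would hide it.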
    13. Python + SQL: SELECT

            # Find ipv4 address ranges assigned to Singapore
            import psycopg2, socket, struct

            def num_to_dotted_quad(n):
                """convert long int to dotted quad string"""
                return socket.inet_ntoa(struct.pack('!L', n))

            conn = psycopg2.connect("dbname=ip2country user=ip2country_rw password=secret")
            cur = conn.cursor()
            cur.execute("""SELECT * FROM ip2country
                           WHERE countrycode2 = 'SG'
                           ORDER BY ipfrom""")
            for row in cur:
                print "%s - %s" % (num_to_dotted_quad(int(row[0])),
                                   num_to_dotted_quad(int(row[1])))
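The `num_to_dotted_quad` helper does not depend on the database, so it can be checked in isolation. Below is a Python 3 sketch applying it to the first Singapore range from the earlier query output; the cross-check against the standard-library `ipaddress` module is an addition for illustration.

```python
import ipaddress
import socket
import struct


def num_to_dotted_quad(n):
    """Convert an integer IPv4 address to a dotted-quad string."""
    # "!L" packs n as an unsigned 32-bit integer in network byte order.
    return socket.inet_ntoa(struct.pack("!L", n))


start = num_to_dotted_quad(1729580032)
end = num_to_dotted_quad(1729581055)

# The ipaddress module (Python 3.3+) performs the same conversion.
same = str(ipaddress.IPv4Address(1729580032))

print("%s - %s" % (start, end))
```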
    14. SQLite
        • sqlite3
            • CPython 2.5 & 3
            • DB-API 2.0
            • Part of CPython distribution since 2.5
    15. PostgreSQL
        • psycopg
            • CPython 2 & 3
            • DB-API 2.0, level 2 thread safe
            • Appears to be most popular
        • py-postgresql
            • CPython 3
            • DB-API 2.0
            • Written in Python with optional C optimizations
            • pg_python - console
    16. PostgreSQL
        • PyGreSQL
            • CPython 2.3+
            • Classic & DB-API 2.0 interfaces
            • Last release 2009
        • pyPgSQL
            • CPython 2
            • Classic & DB-API 2.0 interfaces
            • Last release 2006
    17. PostgreSQL
        • pypq
            • CPython 2.7 & pypy 1.7+
            • Uses ctypes
            • DB-API 2.0 interface
            • psycopg2-like extension API
        • psycopg2ct
            • CPython 2.6+ & pypy 1.6+
            • Uses ctypes
            • DB-API 2.0 interface
            • psycopg2 compat layer
            • pg2-ctypes
    18. MySQL
        • MySQL-python
            • CPython 2.3+
            • DB-API 2.0 interface
            • python/
        • PyMySQL
            • CPython 2.4+ & 3
            • Pure Python DB-API 2.0 interface
        • MySQL-Connector
            • CPython 2.4+ & 3
            • Pure Python DB-API 2.0 interface
    19. Other “Enterprise” Databases
        • cx_Oracle
            • CPython 2 & 3
            • DB-API 2.0 interface
        • informixda
            • CPython 2
            • DB-API 2.0 interface
            • Last release 2007
        • Ibm-db
            • CPython 2
            • DB-API 2.0 for DB2 & Informix
    20. ODBC
        • mxODBC
            • CPython 2.3+
            • DB-API 2.0 interfaces
            • n/mxODBC/doc
            • Commercial product
        • PyODBC
            • CPython 2 & 3
            • DB-API 2.0 interfaces with extensions
        • ODBC interfaces not limited to Windows thanks to iODBC and unixODBC
    21. Jython + SQL
        • zxJDBC
            • DB-API 2.0
            • Written in Java using JDBC API so can utilize JDBC drivers
            • Support for connection pools and JNDI lookup
            • Included with standard Jython installation
        • jyjdbc
            • DB-API 2.0 compliant
            • Written in Python/Jython so can utilize JDBC drivers
            • Decimal data type support
    22. IronPython + SQL
        • adodbapi
            • IronPython 2+
            • Also works with CPython 2.3+ with pywin32
    23. Gerald, the half a schema
        • Database schema toolkit
        • via DB-API currently supports
            • PostgreSQL
            • MySQL
            • Oracle
        • gerald

            s1 = gerald.PostgresSchema('public', 'postgres://ip2country_rw:secret@localhost/ip2country')
            s2 = gerald.PostgresSchema('public', 'postgres://ip2country_rw:secret@localhost/ip2countryv4')
            print s1.schema['ip2country'].compare(s2.schema['ip2country'])

            DIFF: Definition of assigned is different
            DIFF: Column countryname not in ip2country
            DIFF: Definition of registry is different
            DIFF: Column countrycode3 not in ip2country
            DIFF: Definition of countrycode2 is different
    24. SQLPython
        • A command-line interface to relational databases
        • via DB-API currently supports
            • PostgreSQL
            • MySQL
            • Oracle

            $ sqlpython --postgresql ip2country ip2country_rw
            Password:
            0:ip2country_rw@ip2country> select * from ip2country where countrycode2='SG';
            ...
            1728830464.0 1728830719.0 apnic 2011-11-02 SG SGP Singapore
            551 rows selected.
            0:ip2country_rw@ip2country> select * from ip2country where countrycode2='SG'\j
            [...
            {"ipfrom": 1728830464.0, "ipto": 1728830719.0, "registry": "apnic",
             "assigned": "2011-11-02", "countrycode2": "SG", "countrycode3": "SGP",
             "countryname": "Singapore"}]
    25. SQLPython, batteries included

            0:ip2country_rw@ip2country> select * from ip2country where countrycode2 = 'SG';
            ...
            1728830464.0 1728830719.0 apnic 2011-11-02 SG SGP Singapore
            551 rows selected.
            0:ip2country_rw@ip2country> py
            Python 2.6.6 (r266:84292, May 20 2011, 16:42:25)
            [GCC 4.4.5 20110214 (Red Hat 4.4.5-6)] on linux2

                py <command>: Executes a Python command.
                py: Enters interactive Python mode.
                End with `Ctrl-D` (Unix) / `Ctrl-Z` (Windows), `quit()`, `exit()`.
                Past SELECT results are exposed as list `r`;
                most recent resultset is `r[-1]`.
                SQL bind, substitution variables are exposed as `binds`, `substs`.
                Run python code from external files with ``run("")``

            >>> r[-1][-1]
            (1728830464.0, 1728830719.0, 'apnic', datetime.date(2011, 11, 2), 'SG', 'SGP', 'Singapore')
            >>> import socket, struct
            >>> def num_to_dotted_quad(n):
            ...     return socket.inet_ntoa(struct.pack('!L', n))
            ...
            >>> num_to_dotted_quad(int(r[-1][-1].ipfrom))
    26. SpringPython – Database Templates

            # Find ipv4 address ranges assigned to Singapore
            # using SpringPython DatabaseTemplate & DictionaryRowMapper
            from springpython.database.core import *
            from springpython.database.factory import *

            conn_factory = PgdbConnectionFactory(
                user="ip2country_rw", password="secret",
                host="localhost", database="ip2country")
            dt = DatabaseTemplate(conn_factory)
            results = dt.query(
                "SELECT * FROM ip2country WHERE countrycode2=%s",
                ("SG",), DictionaryRowMapper())
            for row in results:
                print "%s - %s" % (num_to_dotted_quad(int(row['ipfrom'])),
                                   num_to_dotted_quad(int(row['ipto'])))
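The template idea itself (run the query, map rows to dictionaries, always close the cursor, and re-raise database errors in a wrapper exception) can be illustrated without Spring Python. The sketch below is not Spring Python's implementation or API: `MiniTemplate` and `QueryError` are hypothetical names, and it assumes Python 3 with `sqlite3`.

```python
import sqlite3


class QueryError(Exception):
    """Hypothetical wrapper exception (not Spring Python's DataAccessException)."""


class MiniTemplate:
    """Hypothetical helper that hides cursor management behind query()."""

    def __init__(self, conn):
        self._conn = conn  # any open sqlite3 connection

    def query(self, sql, params=()):
        cur = self._conn.cursor()
        try:
            cur.execute(sql, params)
            names = [col[0] for col in cur.description]
            # Map each row to a dict keyed by column name.
            return [dict(zip(names, row)) for row in cur.fetchall()]
        except sqlite3.Error as exc:
            # Chain the original error so its traceback is preserved.
            raise QueryError(str(exc)) from exc
        finally:
            cur.close()  # runs on success and on failure


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ip2country (ipfrom INTEGER, ipto INTEGER, countrycode2 TEXT)"
)
conn.execute(
    "INSERT INTO ip2country VALUES (?, ?, ?)", (1729580032, 1729581055, "SG")
)

dt = MiniTemplate(conn)
results = dt.query("SELECT * FROM ip2country WHERE countrycode2 = ?", ("SG",))

# A failing query surfaces as QueryError with the driver error attached.
try:
    dt.query("SELECT * FROM no_such_table")
    wrapped = None
except QueryError as exc:
    wrapped = exc
```

The caller only writes the SQL and reads dictionaries; cursor cleanup and error translation live in one place.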
    27. Attributions
        • DB-API 2.0 PEP
        • Spencer’s DB-API UML Diagram
        • Kuchling’s introduction to the DB-API
    28. Attributions
        • Andy Todd’s OSDC paper
        • csv data used in examples is from WebNet77, licensed under GPLv3
    29. Contact Details
        Mark Rees
        mark at centurysoftware dot com dot my
        +Mark Rees
        @hexdump42