Open Connectivity: BI, Integration and Applications on Couchbase Using ODBC and JDBC: Couchbase Connect 2015


Couchbase and N1QL give you unprecedented power to access and analyze your data, to pick out insights, and to find trends. But to present this information, you still need a BI tool. This presentation gives a brief overview of the JDBC and ODBC drivers for Couchbase, along with quick demos of how to connect to popular BI tools. We then go deeper and show how to code against both the ODBC and JDBC drivers in your own applications.

  1. 1. Open Connectivity BI, Integration, and Apps on Couchbase using ODBC and JDBC June, 2015
  2. 2. • Who I am • Driver(s) Overview • Relational Model • Demos • ODBC in-depth • JDBC in-depth • QA Contents
  3. 3. • Worked in the data access space for eight years and counting • ODBC, OLEDB, JDBC, ADO.NET, ODBO, XMLA • Core developer for the current generation of Simba’s data access technologies • Collaborated at an engineering level with Simba ISV customers to design and implement data drivers that are shipping worldwide today Kyle at a glance
  4. 4. • Simba connects people to data. • HQ’ed in Vancouver, BC. • 100ish employees. • Founded in 1991. • In 1992, Simba co-authored the original ODBC standard with Microsoft. • Simba produces the SimbaEngine® SDK and drivers for the leading data sources on multiple platforms. Simba Technologies at a glance
  5. 5. Simba Technologies at a glance
  6. 6. • Partnership to create read/write ODBC and JDBC drivers • ODBC 3.80 • JDBC 4.0 and 4.1 • Allow easy access to data within Couchbase from your favourite BI and ETL tools Why is Simba here?
  7. 7. What is an ODBC / JDBC Driver? N1QL mode to allow easy and advanced analytics
  8. 8. • Couchbase is NoSQL • Dynamic schema, documents vary within a bucket • ODBC and JDBC are SQL • Expect a fixed schema, each column is one type • Must map from dynamic schema data to fixed schema data Schema(less)
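One way to picture the mapping step is to sample a few documents and derive a fixed column list from them. The following is a minimal sketch in Java, not the driver's actual algorithm: the class name `SchemaInferenceSketch`, the type labels, and the rule that a column seen with conflicting types falls back to VARCHAR are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: derives a fixed column -> type mapping from sampled documents.
final class SchemaInferenceSketch {

    // Maps a Java value to a SQL-like type label (labels are assumptions for this sketch).
    static String typeOf(Object value) {
        if (value instanceof Integer || value instanceof Long) return "BIGINT";
        if (value instanceof Number) return "DOUBLE";
        if (value instanceof Boolean) return "BOOLEAN";
        return "VARCHAR";
    }

    // Unions the attributes of all documents. A column seen with two
    // different types is widened to VARCHAR so each column has one type.
    static Map<String, String> infer(List<Map<String, Object>> documents) {
        Map<String, String> columns = new LinkedHashMap<>();
        for (Map<String, Object> doc : documents) {
            for (Map.Entry<String, Object> e : doc.entrySet()) {
                String seen = typeOf(e.getValue());
                String known = columns.get(e.getKey());
                columns.put(e.getKey(), known == null || known.equals(seen) ? seen : "VARCHAR");
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        Map<String, Object> a = new LinkedHashMap<>();
        a.put("Id", 1);
        a.put("Name", "Couchbase");
        Map<String, Object> b = new LinkedHashMap<>();
        b.put("Id", "two");    // same attribute, different type
        b.put("Rating", 4.5);  // attribute missing from the first document
        System.out.println(infer(Arrays.asList(a, b)));
    }
}
```

Documents that lack an attribute would simply yield NULL for that column in this model.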
  9. 9. • SQL • Catalog, Schema, Table • Couchbase • Namespace, Keyspace Schema => Namespace Table => Keyspace (sort of) Relational Mapping
  10. 10. Sample JSON Document: {"Id": 1, "Name": "Couchbase", "Values": [V1, V2]} Simple Flattening

      | Id | Name      | Values[0] | Values[1] |
      |----|-----------|-----------|-----------|
      | 1  | Couchbase | V1        | V2        |
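Simple flattening can be sketched in a few lines of Java. This is an illustration of the idea only, assuming the document has already been parsed into a `Map` with arrays as `List` values; `FlattenSketch` is a hypothetical name and the code does not reflect how the driver is implemented.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class FlattenSketch {

    // Expands each array attribute into one column per element,
    // named like Values[0], Values[1]; scalars are copied as-is.
    static Map<String, Object> flatten(Map<String, Object> document) {
        Map<String, Object> row = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : document.entrySet()) {
            if (e.getValue() instanceof List) {
                List<?> items = (List<?>) e.getValue();
                for (int i = 0; i < items.size(); i++) {
                    row.put(e.getKey() + "[" + i + "]", items.get(i));
                }
            } else {
                row.put(e.getKey(), e.getValue());
            }
        }
        return row;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("Id", 1);
        doc.put("Name", "Couchbase");
        doc.put("Values", Arrays.asList("V1", "V2"));
        System.out.println(flatten(doc));
    }
}
```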
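  11. 11. Sample JSON Document: {"Id": 1, "Name": "Couchbase", "Values": [V1, V2]} Parent Child Re-Normalization

      Parent table:

      | Id | Name      |
      |----|-----------|
      | 1  | Couchbase |

      Child table:

      | Id | Index | Value |
      |----|-------|-------|
      | 1  | 0     | V1    |
      | 1  | 1     | V2    |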
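Re-normalization splits the same document into a parent row and one child row per array element. The sketch below shows the shape of that transformation in Java; `RenormalizeSketch` and the `[Id, Index, Value]` row layout are assumptions chosen to mirror the slide, not the driver's internals.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RenormalizeSketch {

    // Builds child-table rows of the form [parent Id, array index, element value].
    static List<Object[]> childRows(Object parentId, List<?> values) {
        List<Object[]> rows = new ArrayList<>();
        for (int i = 0; i < values.size(); i++) {
            rows.add(new Object[] { parentId, i, values.get(i) });
        }
        return rows;
    }

    public static void main(String[] args) {
        // The parent row keeps only the scalar attributes.
        Object[] parent = { 1, "Couchbase" };
        System.out.println("parent: " + Arrays.toString(parent));
        for (Object[] child : childRows(1, Arrays.asList("V1", "V2"))) {
            System.out.println("child:  " + Arrays.toString(child));
        }
    }
}
```

Compared with simple flattening, the column count stays fixed no matter how long the array is, at the cost of a join between the parent and child tables.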
  12. 12. Demos
  13. 13. • C API • Versions: 2.x, 3.0, 3.52, 3.80, etc… • Non-Windows platforms are ODBC 3.52 • Driver Managers • Windows, iODBC, unixODBC, etc… • All functions have return codes • SQL_SUCCESS, SQL_SUCCESS_WITH_INFO, SQL_ERROR, etc… ODBC Technicals
  14. 14. • SQLHENV, SQLHDBC, SQLHSTMT, SQLHDESC • Relationship is one-to-many ODBC: Handles
  15. 15. • Allocate with SQLAllocHandle, free with SQLFreeHandle • Ensure you set the version of ODBC in use SQLSetEnvAttr( hEnv, SQL_ATTR_ODBC_VERSION, (SQLPOINTER)SQL_OV_ODBC3_80, SQL_IS_INTEGER) ODBC: Environment
  16. 16. • Allocate with SQLAllocHandle, created from SQLHENV • Maintains the actual connection to Couchbase • Create child statement objects to do work • Disconnect with SQLDisconnect, free with SQLFreeHandle ODBC: Connection
  17. 17. Open a connection using SQLDriverConnect SQLDriverConnect( hDbc, windowHandle, connStr, SQL_NTS, outConnStr, outConnStrLen, &outStrLen, SQL_DRIVER_COMPLETE) Last parameter allows the driver to prompt for information if connStr doesn’t contain all necessary information. ODBC: Connection
  18. 18. • Specify a Driver or DSN • Driver: Must specify all options in connection string • DSN: Can specify options in connection string Example: “DSN=Couchbase;UID=kylep;PWD=testPassword;” ODBC: Connection String
  19. 19. • Allocate with SQLAllocHandle, created from SQLHDBC • Used for issuing queries, retrieving catalog metadata • Free with SQLFreeHandle ODBC: Statement
  20. 20. • Use SQLExecDirect for one-off queries • Use SQLPrepare and SQLExecute for repeated queries SQLExecDirect(hStmt, "<query>", SQL_NTS) or SQLPrepare(hStmt, "<query>", SQL_NTS) SQLExecute(hStmt) ODBC: Querying
  21. 21. • After execution, there is a cursor on the results, positioned before the first row • Use SQLFetch to move the cursor • SQLGetData can be used to fetch cell by cell • SQLBindCol can also be used, much more efficient ODBC: Results
  22. 22. • Reuse connections so driver caches are effective • Use SQLBindCol over SQLGetData • Use array fetches with SQLBindCol • Use SQLPrepare once, SQLExecute multiple times with parameters for loading data • Use parameter arrays when binding parameters with SQLBindParameter • Bind types to match reported parameter or column types ODBC: Performance Tips
  23. 23. SQLGetData vs. SQLBindCol: 10 million rows, 3 columns of wide char, integer, decimal values ODBC: Performance Tips

      | Method                  | Time (s) |
      |-------------------------|----------|
      | SQLGetData              | 141.428  |
      | SQLBindCol              | 6.102    |
      | SQLBindCol (array[100]) | 2.747    |
  24. 24. • Is the Java version of ODBC • Versions: 3.0, 4.0, 4.1, etc… • JDBC version is tied to Java version • Simba will supply JDBC 4.0 and 4.1 versions • There is a driver manager, but role is very limited • Errors are reported via exceptions, warnings via getWarnings() JDBC Technicals
  25. 25. • Relationship is again one-to-many JDBC: Object Hierarchy
  26. 26. • Never used directly by your code • Must be loaded by referencing using Class.forName() • FQCNs • com.simba.couchbase.jdbc4.Driver • com.simba.couchbase.jdbc41.Driver • URL • “jdbc:couchbase://<host>:<port>/<schema>;UseN1QLMode=0/1” JDBC: Driver
  27. 27. • Example String url = "jdbc:couchbase://localhost:8093/default;"; Class.forName("com.simba.couchbase.jdbc4.Driver"); Connection con = DriverManager.getConnection(url); • With JDBC 4.0 and later, Class.forName() can be omitted JDBC: Driver
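The URL pattern from the Driver slide can be assembled with a small helper. This is a sketch under stated assumptions: `CouchbaseUrlSketch` and `buildUrl` are hypothetical names, and it assumes that `UseN1QLMode=1` switches N1QL mode on and `0` switches it off, which the slides imply but do not spell out.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

final class CouchbaseUrlSketch {

    // Follows the slide's pattern: jdbc:couchbase://<host>:<port>/<schema>;UseN1QLMode=0/1
    static String buildUrl(String host, int port, String schema, boolean n1qlMode) {
        return "jdbc:couchbase://" + host + ":" + port + "/" + schema
                + ";UseN1QLMode=" + (n1qlMode ? 1 : 0);
    }

    public static void main(String[] args) {
        String url = buildUrl("localhost", 8093, "default", true);
        // With a JDBC 4.0+ driver jar on the classpath no Class.forName() call is needed.
        try (Connection con = DriverManager.getConnection(url)) {
            System.out.println("Connected, closed=" + con.isClosed());
        } catch (SQLException e) {
            // Reached when no registered driver accepts the URL or the server is unreachable.
            System.err.println("Connection failed: " + e.getMessage());
        }
    }
}
```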
  28. 28. • Used directly by your code • FQCNs • com.simba.couchbase.jdbc4.DataSource • com.simba.couchbase.jdbc41.DataSource • Can be used to programmatically set connection properties by using functions instead of a connection string • Allows use of advanced features such as PooledConnection • Used less often than Driver JDBC: DataSource
  29. 29. • Can create child statement objects for queries • Can create DatabaseMetaData objects for metadata via getMetaData() • Common pitfalls • Not closing connections; use a finally block to ensure it is closed • Not checking warnings, ever; call getWarnings() regularly • Creating a new connection for every operation; reuse connections where possible JDBC: Connection
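The first two pitfalls can be avoided with a small amount of boilerplate. The sketch below uses try-with-resources (Java 7 and later) in place of the explicit finally block the slide mentions, and walks the warning chain returned by `getWarnings()`. `ConnectionHygieneSketch`, `collectWarnings`, and `run` are hypothetical names, and the URL is supplied by the caller.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.SQLWarning;
import java.util.ArrayList;
import java.util.List;

final class ConnectionHygieneSketch {

    // Warnings form a linked chain; follow getNextWarning() until it returns null.
    static List<String> collectWarnings(SQLWarning first) {
        List<String> messages = new ArrayList<>();
        for (SQLWarning w = first; w != null; w = w.getNextWarning()) {
            messages.add(w.getMessage());
        }
        return messages;
    }

    static void run(String url) throws SQLException {
        // try-with-resources closes the connection even if the body throws.
        try (Connection con = DriverManager.getConnection(url)) {
            // ... reuse this one connection for several statements here ...
            for (String message : collectWarnings(con.getWarnings())) {
                System.err.println("warning: " + message);
            }
            con.clearWarnings(); // start the next check from an empty chain
        }
    }
}
```

Statement and ResultSet objects expose `getWarnings()` as well, so the same helper can be applied to them.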
  30. 30. • DatabaseMetaData • Created by Connection objects • Actually only one object per connection, cached and reused • Provides access to catalog metadata • getCatalogs(), getTables(), getColumns(), etc… • Provides access to database metadata • getDatabaseProductVersion(), getIdentifierQuoteString(), etc… JDBC: Database MetaData
  31. 31. • Created by Connection objects • Three different types • Statement – for one-off querying • PreparedStatement – for queries with parameters • CallableStatement – for stored procedures with output parameters* • Statement objects will eventually dispose of themselves once out of scope, but best practice is to close() them when done JDBC: Statement Objects
  32. 32. • Cannot use parameters • Use execute(), executeQuery(), or executeUpdate() to execute SQL or N1QL queries JDBC: Statement
  33. 33. execute() Example Statement stmt = conn.createStatement(); try { if (stmt.execute("select * from `beer-sample`")) { ResultSet rs = stmt.getResultSet(); rs.close(); } } finally { stmt.close(); } JDBC: Statement
  34. 34. • For use with parameters • Can get metadata about results before execution with getMetaData() • Can get metadata about parameters using getParameterMetaData() • Use set*() functions to provide parameter values JDBC: PreparedStatement
  35. 35. • When loading data, use batches • Set all required parameters for one execution • Call addBatch() to add the current set of parameters • Call executeBatch() to execute all added batches at once • Set parameters as reported types to avoid conversion overhead in the driver • Reuse the statement for multiple executions JDBC: PreparedStatement
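The batching steps above can be sketched as follows. The statement is passed in already prepared, so no query text is assumed. The `Row` type, the two-column layout (an integer and a string), and the `batchSize` parameter are hypothetical choices for the example.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

final class BatchLoadSketch {

    /** Hypothetical row type used only for this illustration. */
    static final class Row {
        final int id;
        final String name;
        Row(int id, String name) { this.id = id; this.name = name; }
    }

    // Reuses one prepared statement, adds each parameter set with addBatch(),
    // and flushes with executeBatch() every batchSize rows. Returns the number of flushes.
    static int load(PreparedStatement ps, List<Row> rows, int batchSize) throws SQLException {
        int flushes = 0;
        int pending = 0;
        for (Row row : rows) {
            ps.setInt(1, row.id);       // set as the reported type (assumed integer here)
            ps.setString(2, row.name);  // assumed character type
            ps.addBatch();
            if (++pending == batchSize) {
                ps.executeBatch();
                flushes++;
                pending = 0;
            }
        }
        if (pending > 0) {              // flush the final partial batch
            ps.executeBatch();
            flushes++;
        }
        return flushes;
    }
}
```

A caller would obtain `ps` from `Connection.prepareStatement(...)` once and pass it to `load`. Flushing in fixed-size chunks bounds how many parameter sets are pending at once, and the best chunk size is something to measure against the actual driver.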
  36. 36. • Represents query results or catalog metadata • Describe the result set using getMetaData() • Move through the result using next() • Can use isAfterLast(), isFirst(), isLast(), etc. to check cursor position • Retrieve cell values using get*() methods • The driver supports all conversions between types listed by the JDBC spec • Try to retrieve values as their reported type to avoid conversion overhead • Remember to check wasNull() after calling a get*() method JDBC: ResultSet
  37. 37. ResultSet Example ResultSet rs = stmt.executeQuery("<query>"); try { int numColumns = rs.getMetaData().getColumnCount(); while (rs.next()) { for (int i = 1; i <= numColumns; ++i) { System.out.println(rs.getString(i)); } } } finally { rs.close(); } JDBC: ResultSet
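The wasNull() reminder matters most for primitive getters: `getInt()` returns 0 for a SQL NULL, so the two cases can only be told apart by calling `wasNull()` immediately after the read. A minimal sketch, with `NullableReadSketch` and `getNullableInt` as hypothetical names:

```java
import java.sql.ResultSet;
import java.sql.SQLException;

final class NullableReadSketch {

    // Reads an integer column (1-based index) and maps SQL NULL to a Java null.
    static Integer getNullableInt(ResultSet rs, int columnIndex) throws SQLException {
        int value = rs.getInt(columnIndex);   // yields 0 when the cell is NULL
        return rs.wasNull() ? null : value;   // wasNull() refers to the last column read
    }
}
```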
  38. 38. Q & A