This document advertises CData Power BI connectors which allow users to connect Power BI to over 110 live data sources using DirectQuery. The connectors treat APIs like databases by wrapping SQL queries around them and mapping API resources and objects to tables and views. This provides a uniform SQL experience across different data sources and allows for features like nested JSON values in columns, stored procedures, and relating sub-collections as tables. A demonstration connects to MongoDB and QuickBooks Desktop as examples.
This is CData Software Power BI Connectors. If you’re looking for the latest (and fastest) way to connect to live data from more than 100 different sources, then you’re in the right spot!
CData Software is a leading provider of Data Connectivity solutions (we’ll dig into what that means in a bit), with roots in data connectivity going back to 1994.
My name is Jerod Johnson. I’m a technology evangelist with CData and I’ve been with the company for around 5 years.
In this session, I’ll give a general overview of the CData technology, introduce the CData Power BI Connectors, and discuss how they relate to connecting via ODBC.
Then we’ll spend the bulk of our time together in a demo of connecting to live data from Power BI using the CData connectors, starting with configuring a connection and ending with simple visualizations in Power BI.
Let’s start with the wow factor. CData Software offers connectors that enable live connectivity to over 110 different SaaS, Big Data, and NoSQL sources through a familiar SQL interface. Sources range from widely used enterprise platforms like Salesforce and the various MS Dynamics platforms to niche sources like generic REST APIs or XML data. On this slide you can see almost all of our sources.
So what is it that CData Software does? We make APIs look like databases. Data analysts and business app users are generally familiar with tabular presentation of data, if not with SQL itself. What our connectors do is provide a SQL interface to your data, no matter where it is. For our Power BI connectors, this means that whenever Power BI submits a SQL query to a data source, the connectors translate that query into the appropriate API or protocol-level request for the source. When the source responds, the connectors then translate the response into a table, with rows and columns.
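To picture the last step, here is a minimal Python sketch of turning a JSON API response into rows and columns. The function name `response_to_table` and the sample payload are assumptions made up for illustration; this is not how the connectors are implemented internally.

```python
import json

def response_to_table(body, columns):
    """Turn a JSON array of objects into (columns, rows).

    Hypothetical helper for illustration only. Attributes missing
    from a record come back as None, much like a SQL NULL.
    """
    records = json.loads(body)
    rows = [tuple(record.get(col) for col in columns) for record in records]
    return columns, rows

# Illustrative payload, as an API might return for a "Leads" resource.
body = '[{"name": "Ada", "email": "ada@example.com"}, {"name": "Bo"}]'
columns, rows = response_to_table(body, ["name", "email"])
for row in rows:
    print(row)
```

Each JSON object becomes one row, and each requested attribute becomes one column.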
How does CData make APIs look like databases?
We start with a database metaphor for API data. Each table represents a set of entities or objects. Each row represents an individual entity of the given table, and individual columns represent attributes within that entity. So imagine a table of Dynamics CRM Leads, where each row is a lead and there are columns for attributes like name, email address, priority, and all of the other Lead information the Dynamics CRM API exposes.
For the most part, the connectors are wrapped around REST/SOAP APIs, so SQL commands correspond with HTTP verbs: SELECT with GET; INSERT and UPDATE with POST, PUT, and PATCH; and (shock of shocks) DELETE with DELETE.
For those API operations that aren’t easily tabularized, the CData drivers make judicious use of Stored Procedures. This might mean uploading a file to SharePoint, or manually working through an OAuth flow.
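The verb correspondence can be written down as a small lookup. This is a minimal sketch: the dictionary `SQL_TO_HTTP` and the function `candidate_http_verbs` are hypothetical names, and which verb is actually used for a given INSERT or UPDATE depends on the API being wrapped.

```python
# Hypothetical mapping for illustration; the real choice among
# POST, PUT, and PATCH depends on the API being wrapped.
SQL_TO_HTTP = {
    "SELECT": ["GET"],
    "INSERT": ["POST", "PUT", "PATCH"],
    "UPDATE": ["POST", "PUT", "PATCH"],
    "DELETE": ["DELETE"],
}

def candidate_http_verbs(sql):
    """Return the HTTP verbs a SQL statement could map to,
    based only on the statement's first keyword."""
    stripped = sql.strip()
    if not stripped:
        raise ValueError("empty SQL statement")
    keyword = stripped.split(None, 1)[0].upper()
    if keyword not in SQL_TO_HTTP:
        raise ValueError(f"unsupported statement: {keyword}")
    return SQL_TO_HTTP[keyword]

print(candidate_http_verbs("SELECT name FROM Leads"))
print(candidate_http_verbs("DELETE FROM Leads WHERE id = 1"))
```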
For data sources with nested values (like JSON or XML) the connector will return the full aggregate of the value. The connectors do support JSON and XML parsing functions in the SQL query to drill down into the nested data as needed.
For all supported data sources, the connectors leverage collaborative query processing. This means that whenever possible, complex querying is pushed down to the server, minimizing the need for client-side processing. A built-in SQL engine manages whatever functionality isn’t supported at the source and processes data on the client side. For example, a source might support filtering, but not JOINs. In this case, the filter is passed into the HTTP/protocol request and the JOIN is performed in-memory on the client side.
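Here is a minimal Python sketch of that filter-versus-JOIN split. Everything in it is an assumption for illustration: the in-memory lists stand in for a remote source, and `fetch` and `leads_with_owner` are made-up names, not connector APIs.

```python
# Hypothetical in-memory data standing in for a remote source that
# supports server-side filtering but has no JOIN capability.
LEADS = [
    {"id": 1, "name": "Ada", "owner_id": 10, "priority": "high"},
    {"id": 2, "name": "Bo", "owner_id": 11, "priority": "low"},
    {"id": 3, "name": "Cy", "owner_id": 10, "priority": "high"},
]
OWNERS = [{"id": 10, "owner": "Kim"}, {"id": 11, "owner": "Lee"}]

def fetch(resource, **filters):
    """Simulate a request in which the filter travels with the
    request, so only matching rows come back (pushdown)."""
    rows = {"leads": LEADS, "owners": OWNERS}[resource]
    return [r for r in rows if all(r.get(k) == v for k, v in filters.items())]

def leads_with_owner(priority):
    # The filter is pushed down to the source ...
    leads = fetch("leads", priority=priority)
    # ... while the JOIN runs client-side, using a hash lookup on owner id.
    owners = {o["id"]: o["owner"] for o in fetch("owners")}
    return [{"name": lead["name"], "owner": owners[lead["owner_id"]]} for lead in leads]

print(leads_with_owner("high"))
```

Only the filtered leads cross the wire; the client joins just what it received.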
So what are the CData Power BI Connectors?
These are native Power BI connectors that utilize the custom connector functionality of Power BI.
The setup is practically identical to ODBC (meaning you configure the connection via a DSN).
Since the connectors leverage the custom connector functionality, DirectQuery is available (unlike with ODBC).
As I’ve mentioned before, the CData connectors enable connectivity to more than 110 sources, both on-premise and in the cloud.
CData Connectors have built-in, optimized data processing (we’re the fastest connectors in the business, often only limited by web speeds when it comes to returning data).
Thanks to deliberate API implementations, each connector is able to push down all supported request features based on the data source.
CData drivers have a robust, innovative SQL to NoSQL interface, offering flattening of nested data and the ability to treat hierarchical structures as separate tables or as a single table built with implicit JOINs.
Why should you use the CData Power BI Connectors?
As businesses grow, so too does the number of data sources they use. Studies show that the average enterprise utilizes 20 cloud-based data sources and at least as many on-premises data stores. Connecting to this disparate data often incurs high development and maintenance costs. And even if you had standard connectivity (via an ODBC driver), LIVE access to your disparate data was only a dream.
With an Import connection, you are likely dealing with static data, or you need to schedule a refresh of the data.
With CData Connectors, the development and maintenance is done for you. And since the DirectQuery functionality is supported, whenever you refresh your dashboards, visualizations, and reports, the underlying data is updated live from the source.
With the explanations out of the way, it’s time to see the connectors in action. I’ll be performing live demos of our MongoDB and QuickBooks Desktop connectors. I chose on-premise sources since you never know what your connectivity will be like at a conference.
With MongoDB, we can take a look at how the connectors handle the SQL to NoSQL interface and how we push down queries.
With QuickBooks, we’ll get a look at a more business-oriented source that can be used to build visualizations and analytics that result in actionable insights.
Let’s start with MongoDB.
For this demo, we’ll be connecting to a pared down version of the restaurants primer dataset provided by MongoDB.
Let’s start by taking a look at what a document in the restaurants collection looks like.
As you can see, we’ve got some nested objects and arrays in our documents. With the CData Power BI connectors, how this data is parsed is fully configurable. You can choose to leave all objects as aggregates or choose to flatten the objects, the arrays, or both. When the data is flattened, we use dot notation (which often gets translated into underscores for various tools, including Power BI) to denote the nested structure.
The connectors are capable of creating schema files for NoSQL data, to allow further customization of data parsing or to simply accelerate data consumption. The drivers determine the schema through intelligent row scanning and data typing. Before we go any further, let’s take a look at configuring our connection to MongoDB. When you configure the connection, you can set the server and port, any authentication (including an authentication database), and configure the NoSQL to SQL interface. For this demo, I want to flatten all objects and I want to flatten the first 2 elements in any arrays we encounter. We can even configure the connector to create schema files and have full control over where those schema files are saved.
Here is a sample schema for the restaurants collection. As you can see, all of the top level fields are easily parsed as columns. The address object is flattened, as are both elements of the coord array. The grades array is also flattened as are the objects that serve as array elements. Note that the dot notation to represent hierarchical data becomes underscore notation for the column names.
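The flattening described here (all objects, plus the first 2 elements of each array, joined with underscores) can be sketched in a few lines of Python. The `flatten` function and the sample document are assumptions for illustration; the exact column names the connector generates may differ.

```python
def flatten(doc, max_array_items=2, sep="_", prefix=""):
    """Flatten nested dicts and the first `max_array_items` elements
    of each list into a single-level dict of column name -> value.

    Hypothetical helper for illustration. A list nested directly
    inside another list is kept whole, as an aggregate value.
    """
    columns = {}
    for key, value in doc.items():
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            columns.update(flatten(value, max_array_items, sep, name))
        elif isinstance(value, list):
            for i, item in enumerate(value[:max_array_items]):
                item_name = f"{name}{sep}{i}"
                if isinstance(item, dict):
                    columns.update(flatten(item, max_array_items, sep, item_name))
                else:
                    columns[item_name] = item
        else:
            columns[name] = value
    return columns

# Illustrative document shaped like the restaurants collection.
restaurant = {
    "name": "Example Bake Shop",
    "borough": "Bronx",
    "address": {"street": "Example Ave", "coord": [-73.85, 40.84]},
    "grades": [{"grade": "A", "score": 2}, {"grade": "A", "score": 6},
               {"grade": "B", "score": 10}],
}
for column, value in flatten(restaurant).items():
    print(column, "=", value)
```

With these settings, the third entry in `grades` is dropped from the flattened row, which mirrors the "first 2 elements" choice made in the demo configuration.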
With the configuration done, we’re ready to test the connection and get started.
We’ll start with a new Power BI report. When we click data, we can search for CData or click the other tab and find the connector we want. From there, the sequence is just like getting data from any other source. We navigate into the database, select the “table” that we want, and click load data.
From here, we’re ready to build a visualization. In this case, we want to get a map of all of the restaurants, coloring the entries based on the borough and using the score to determine the size of the dot. In the tooltip, we’ll put the name and cuisine of the restaurant. If we wanted, we could filter the results by cuisine. Each time we change the fields and filters, a new query is sent to the MongoDB database and fresh data is returned. The SQL request created by Power BI is translated into a MongoDB request and pushed down to the MongoDB server. Whatever query functionality isn’t supported by MongoDB will be handled in-memory by the SQL Engine built into the connector.
I’ve got a log file that shows the SQL query created by Power BI and the subsequent MongoDB request. So you can see that the specific fields are requested and the filter is applied at the server level, instead of importing the data and relying on the installation machine to handle the data processing.
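As a rough picture of that translation, take a query like `SELECT name, borough FROM restaurants WHERE cuisine = 'Bakery'`. The sketch below builds the filter and projection documents that a MongoDB `find()` call accepts. The function `to_mongo_find` is a made-up name, and the request the connector actually issues may look different from this.

```python
def to_mongo_find(columns, equals):
    """Build (filter, projection) documents for a MongoDB find() call
    from a column list and a dict of equality conditions.

    Hypothetical helper for illustration; it handles only simple
    column selection and equality filters.
    """
    query_filter = dict(equals)
    # Request only the selected fields; suppress _id unless asked for.
    projection = {col: 1 for col in columns}
    projection.setdefault("_id", 0)
    return query_filter, projection

query_filter, projection = to_mongo_find(["name", "borough"], {"cuisine": "Bakery"})
print(query_filter)
print(projection)
```

Because both the field list and the filter are part of the request, the server returns only the matching documents and only the requested fields.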
Next, let’s take a look at an integration with QuickBooks Desktop. (Worth noting: we have connectors for QuickBooks Online and QuickBooks Point of Sale as well.)
Since we’re connecting to structured QuickBooks data, we can jump right into the configuration. Now, it should be said that our QuickBooks Desktop connector comes bundled with another app that eases connectivity to QuickBooks Desktop data. The Remote Connector simply provides an easy-to-use web-based proxy for servicing requests between apps and QuickBooks Desktop. We’ve already configured a user for the company file we’ll be working with.
In the DSN, there isn’t actually much for us to do. We simply configure the user and password for the Remote Connector user with access to our Company File. Since our remote connector is local, we get to use the default connection property values. Click test connection and we’re ready to go.
For QuickBooks, we’ll build a stacked chart that displays bill payments for each vendor, by check or credit card. To do so, we want to JOIN the Vendors, BillPaymentChecks and BillPaymentCreditCards tables together. We can do this from the Relationships tab. By JOINing the tables and requesting a limited data set, we drastically reduce the amount of data to be processed by Power BI, offloading the bulk of the work to the QuickBooks machine. Now, for the demo, that won’t improve much, since QuickBooks is running on the laptop, but in a production environment, it would make a difference. With the relationships configured, we’re ready to build our visualization, leveraging the configured relationship to inform the aggregations by Vendor. From there, we can build our chart. And since the connectors use DirectQuery, every time the visualization is refreshed, new data is requested from QuickBooks.
As you’ve seen, the CData Power BI Connectors provide live connectivity to data from more than 110 different sources.
Two key benefits include optimized data processing (we’re seriously the fastest connectors on the market) and collaborative query processing, meaning that you can rely on the data source to manage complex queries and know that you’re working with the minimum amount of data in Power BI.
Learn more (and download a beta or 5) from cdata.com/powerbi
Any questions?