Open Source Business Intelligence Overview
From Data Source to Analytics and Beyond
Open Source BI Overview

Proof that an entire data driven Business Intelligence stack can be successfully implemented through open source software.

Published in: Technology

  1. Open Source Business Intelligence Overview
     From Data Source to Analytics and Beyond
  2. Agenda
     ● Open Source and BI
     ● Data sources
     ● Data Integration
     ● Reporting/Frontend
     ● Analytics
     ● Data Quality
     ● Data Governance
  3. Source: https://www.informs.org/ORMS-Today/Public-Articles/October-Volume-37-Number-5/Back-in-Business
  4. Data Sources
     Traditional
     ○ PostgreSQL - http://www.postgresql.org/
       ■ Pivotal Greenplum - http://gopivotal.com/
     ○ MySQL - http://www.mysql.com/
       ■ Percona - http://www.percona.com/
       ■ MariaDB - https://mariadb.org/
     Columnar
     ○ MySQL Derivatives
       ■ InfiniDB - http://infinidb.org/
       ■ Infobright - https://www.infobright.com/
     ○ MonetDB - http://www.monetdb.org/Home
  5. Relational vs Columnar
     Source: http://www.calpont.com/images/column-oriented-database.jpg
  6. Data Sources
     NoSQL
     ○ Cassandra - http://cassandra.apache.org/
     ○ MongoDB - http://www.mongodb.org/
     ○ CouchDB - http://couchdb.apache.org/
     ○ Infinispan - http://www.jboss.org/infinispan/
     ○ Hadoop - http://hadoop.apache.org/
       ■ HBase - http://hbase.apache.org/
       ■ Hive - http://hive.apache.org/
     OLAP
     ○ Mondrian - http://mondrian.pentaho.com/
  7. Source: http://gerardnico.com/wiki/database/oracle/oracle_olap
  8. The Next Wave of Data Sources
     Virtualization
     ○ Teiid - http://www.jboss.org/teiid/
     Semantic Web/Graph
     ○ Sesame - http://www.openrdf.org/
     ○ Neo4j - http://www.neo4j.org/
     ○ OrientDB - http://www.orientdb.org/
     ○ Infogrid - http://infogrid.org/trac/
  9. Source: http://www.ebizq.net/blogs/guest_session/2009/12/putting-data-to-work-for-cloud-bpm-mdm-and-soa-projects.php
  10. Graph Database
      Source: http://en.wikipedia.org/wiki/Graph_database
  11. Data Integration
      Kettle - http://kettle.pentaho.com/
      Talend - http://www.talend.com/
      CloverETL - http://www.cloveretl.com/
  12. Reporting
      BIRT (Actuate) - http://www.eclipse.org/birt/phoenix/
      Pentaho - http://reporting.pentaho.com/
      Jaspersoft - http://community.jaspersoft.com/
      Saiku - http://meteorite.bi/saiku
  13. Full Stacks
      SpagoBI - http://www.spagoworld.org/xwiki/bin/view/SpagoBI/#
      Pentaho - http://www.pentaho.com/
      Jaspersoft - http://www.jaspersoft.com/
  14. Analytics
      R - http://www.r-project.org/
      Weka - http://www.cs.waikato.ac.nz/ml/weka/
      RapidMiner - http://rapid-i.com/content/view/181/
  15. Data Quality
      Profiling
      ○ DataCleaner - http://datacleaner.org/
      ○ DQGuru - http://www.sqlpower.ca/page/dqguru
      Suites
      ○ Talend - http://www.talend.com/products/data-quality
      Testing
      ○ SQLUnit - http://sqlunit.sourceforge.net/
      ○ dbFit - http://benilovj.github.io/dbfit/
      ○ etlUnit - https://github.com/dbaAlex/etlUnit (shameless plug :p)
  16. Data Governance
      MDM
      ○ Talend - http://www.talend.com/resource/data-governance.html
      Business Rules Engine
      ○ JBoss Drools - http://www.jboss.org/drools/
      ○ Open Rules - http://openrules.com/
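
The "Relational vs Columnar" slide contrasts the two storage layouts only through an image. As a rough illustration, here is a minimal sketch of the idea, assuming a tiny hypothetical sales table (the column names and values are made up for this example and do not come from the deck). A row layout keeps each record together, while a column layout keeps each column's values together, so an aggregate over one column reads only that column.

```python
# Hypothetical sample data; names and values are illustrative only.
rows = [
    {"id": 1, "region": "east", "amount": 120.0},
    {"id": 2, "region": "west", "amount": 75.5},
    {"id": 3, "region": "east", "amount": 30.0},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def total_from_rows(records, field):
    """Row layout: every whole record is visited to read one field."""
    return sum(record[field] for record in records)


def total_from_columns(columns, field):
    """Column layout: only the requested column is read."""
    return sum(columns[field])


columns = to_columns(rows)
row_total = total_from_rows(rows, "amount")
column_total = total_from_columns(columns, "amount")
```

Both functions return the same total; the difference is how much data each has to walk through. Real columnar engines such as the ones listed on the Data Sources slide add compression and on-disk formats that this sketch does not attempt to model.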
