Open Source Business Intelligence Overview

  1. Open Source Business Intelligence Tools
      Alex Meadows, TriLUG, January 2012
  2. Agenda
      • Business Intelligence Overview
      • Review of OSBI Tools
          • Data Warehousing
          • Data Integration
          • Reporting/OLAP
          • Visualization
          • Statistical Analysis/Predictive Analytics
  8. What Is Business Intelligence?
      Utilizing technology to identify and analyze trends in data to make better business decisions.
  9. Overlapping Fields
      Source: Back In Business, Klimberg, Miori (www.informs.org)
  10. Competing On Analytics
      Source: Competing on Analytics; Thomas Davenport, Jeanne Harris
  11. Phases of Growth
  12. The Three Types of Questions
      • What happened?
          • How was performance last week?
      • What is currently happening?
          • How is performance right now?
      • What will happen?
          • What can I do to reach our goals?
  13. Data Warehousing
      • Store data outside of the application/normal business environment (e.g. ERP systems)
      • Specific for reporting/analytics
      • Modeling Styles
          • 3NF (normal database modeling)
          • Data Marts (aka star schemas)
          • Data Vault (hybrid 3NF/Data Mart)
          • Anchor Modeling (6NF)
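The data mart (star schema) style listed above can be illustrated with a minimal sketch. It assumes an in-memory SQLite database from Python's standard library; the table names, columns, and sample rows (`fact_sales`, `dim_date`, `dim_product`) are hypothetical and do not come from the slides.

```python
import sqlite3

# Hypothetical star schema: one fact table referencing two dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, week TEXT);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE fact_sales (
    date_key INTEGER REFERENCES dim_date(date_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    amount REAL
);
INSERT INTO dim_date VALUES (1, '2012-W01'), (2, '2012-W02');
INSERT INTO dim_product VALUES (10, 'Books'), (20, 'Games');
INSERT INTO fact_sales VALUES (1, 10, 100.0), (1, 20, 50.0), (2, 10, 75.0);
""")


def sales_by_week_and_category(connection):
    """Join the fact table to its dimensions and sum the measure."""
    query = """
        SELECT d.week, p.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_date AS d ON d.date_key = f.date_key
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY d.week, p.category
        ORDER BY d.week, p.category
    """
    return connection.execute(query).fetchall()


for row in sales_by_week_and_category(conn):
    print(row)
```

The point of the layout is that reporting queries only ever join the central fact table to small descriptive tables, which keeps "how was performance last week?" style questions simple to express.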
  19. Data Warehousing
      • Databases
          • MySQL, Postgres, etc.
      • Columnar Data Stores
          • Infobright*, LucidDB, InfiniDB*, etc.
      • Hybrid Data Warehouse Databases
          • Greenplum* (both RDBMS and Columnar)
      • NoSQL
          • Hadoop, CouchDB, MongoDB, etc.
      * Hardware and/or software limitations in community editions
  20. RDBMS vs Columnar
      Source: http://www.calpont.com/column-oriented-database-bi
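As a rough illustration of the row-oriented versus column-oriented distinction, the sketch below holds the same hypothetical records in both layouts using plain Python structures. It is a toy model of the idea only and makes no claim about how any of the listed databases store data internally.

```python
# Hypothetical sample data in row-oriented form: one dict per record.
rows = [
    {"order_id": 1, "region": "East", "amount": 100.0},
    {"order_id": 2, "region": "West", "amount": 50.0},
    {"order_id": 3, "region": "East", "amount": 75.0},
]


def to_columns(records):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Row layout: every whole record is visited just to total one field.
row_total = sum(record["amount"] for record in rows)

# Column layout: only the 'amount' column is read for the same aggregate.
column_total = sum(columns["amount"])

print(row_total, column_total)
```

Analytic queries usually aggregate a few fields over many records, which is why the column layout tends to suit reporting workloads.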
  21. NoSQL?
      • Not Only SQL
      • Unstructured/semi-structured data
      • Huge data sets (multi-terabyte to petabyte+)
      Source: http://www.information-management.com/specialreports/20040622/1005301-1.html
  24. Data Integration
      • Syncing data across systems
      • Includes:
          • ETL (Extract, Transform, Load)
          • MDM (Master Data Management)
          • EAI (Enterprise Application Integration)
          • EII (Enterprise Information Integration)
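The ETL item above can be sketched in a few lines. This is a minimal illustration using only Python's standard library, not how Talend or Kettle work internally; the CSV content, the `sales` table, and the function names are all assumptions made for the example.

```python
import csv
import io
import sqlite3

# Hypothetical source extract with inconsistent casing and whitespace.
SOURCE_CSV = """customer,amount
 alice ,10.50
BOB,4.25
alice,1.00
"""


def extract(text):
    """Extract: read raw records from the CSV source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(records):
    """Transform: normalize customer names and convert amounts to floats."""
    return [(r["customer"].strip().lower(), float(r["amount"])) for r in records]


def load(connection, cleaned):
    """Load: write the cleaned records into the target table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL)"
    )
    connection.executemany("INSERT INTO sales VALUES (?, ?)", cleaned)
    connection.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SOURCE_CSV)))

totals = conn.execute(
    "SELECT customer, SUM(amount) FROM sales GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)
```

Keeping the three steps as separate functions mirrors how ETL tools let each stage be swapped or extended independently.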
  29. Talend
      • Data Management Tool Suite
          • ETL
          • MDM
          • Data Profiling
          • Data Quality
      • Code generator
      • Eclipse based
      • Extensible plugin architecture
  35. Pentaho K.E.T.T.L.E.
      • Kettle Extraction, Transport, Transformation, and Loading Environment
      • Focus on ETL
      • Extensible plugin architecture
      • Engine based
  39. Reporting
      • Focus: Historical Analysis
  40. Reporting Options
      Columns: MDX, "Pivot Table", Charting, SQL, Other Sources*, Drill Through, Parameterized
      BIRT: ✔ ✔ ✔ ✔ ✔ ✔
      Pentaho: ✔ ✔ ✔ ✔ ✔ ✔
      JasperReports: ✔ ✔ ✔ ✔ ✔ ✔
      SQL Power Wabit: ✔ ✔ ✔ ✔ ✔ ✔ ✔
      Saiku: ✔ ✔ ✔ ✔ ✔ ✔ ✔
      * Flat files, NoSQL, etc.
  41. BIRT Example
  42. Visualization
      • Focus: Trending and Present
  43. Pentaho CDE/CDF
      • Dashboard framework and editor built into Pentaho BI Server
      • Community developed; uses open web languages (JavaScript, HTML, etc.)
  45. Statistics/Predictive Analytics
      • Focus: All relevant data used to predict outcomes
  46. Statistics/Predictive Analytics
      • R: stats oriented
      • Weka: machine learning oriented
      • RapidMiner: mixed
          • Originally YALE
          • Weka and R plugins
          • Like SAS Enterprise Miner
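To make "using data to predict outcomes" concrete, here is a minimal sketch of an ordinary least-squares trend line in plain Python. It is a generic technique, not something taken from R, Weka, or RapidMiner, and the weekly sales figures are invented for illustration.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical history: sales for weeks 1 to 4.
weeks = [1, 2, 3, 4]
sales = [10.0, 12.0, 14.0, 16.0]

slope, intercept = fit_line(weeks, sales)

# "What will happen?": extrapolate the fitted trend to week 5.
forecast = slope * 5 + intercept
print(slope, intercept, forecast)
```

Dedicated statistics tools add model diagnostics and far richer methods on top of this basic idea.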
  51. BI From Reporting to Statistical Analysis
      Columns: ETL, Metadata, Reporting, Dashboards, OLAP***, Statistics, Automated Decisions
      Jaspersoft: ✔* ✔ ✔ ✔
      Pentaho: ✔ ✔ ✔ ✔ ✔ **
      SpagoBI: ✔* ✔* ✔ ✔ ✔ ✔ **
      * Utilizes Talend ETL
      ** Utilizes Weka Data Mining
      *** All use Mondrian for OLAP, with different front ends
  52. Shameless Plug
      • RTP Pentaho User Group
          • On LinkedIn (soon to be also on Meetup)
          • Meets quarterly