The State of Open Source Business Intelligence
Christian Donner

Open sourcebi

about open source platforms and data

Published in: Technology
  • JasperReports: One of the older projects, started in 2001. More recently a company, JasperSoft, was formed to invest in JasperReports and to provide support, training and various other services. JasperSoft represents the JasperReports project in consortiums such as Bizgres.
  • Agata Report: From their web site: "Agata Report is a Database Reporting Tool and EIS tool, MIS tool (graph generation), like Crystal Reports. It's written in PHP-GTK and allows you to edit and get SQL results from several databases (PostgreSQL, MySQL, Oracle, SyBase, MsSql, FrontBase, DB2, Informix and InterBase) as PostScript, plain text, HTML, XML, PDF, or spreadsheet (CSV) formats through its graphical interface. You can also define levels, subtotals, and a grand total for the report, merge the data into a document, generate address labels, or even generate a complete ER-diagram from your database."
  • DataVision: An Open Source report writer that allows drag-and-drop report design through its GUI. It is written in Java and can connect to any database supporting JDBC.
  • OpenReports: From their website: "OpenReports is a flexible open source web reporting solution that allows users to generate dynamic reports in a browser. OpenReports uses JasperReports, an excellent full featured open source reporting engine, and was developed using leading open source components including WebWork, Velocity, Quartz, and Hibernate and includes full support for JasperReports." They recently announced OpenReports Portal Edition, which blends OpenReports with the Apache Jetspeed Enterprise Portal system. Also of interest are the related projects ObjectVisualizer and OpenReports Designer.
  • OpenRPT: A full-featured, cross-platform SQL report writer that stores its report definitions as XML and has a WYSIWYG report writer that can be used stand-alone or embedded.
  • JFreeReport: A standalone Java report library with a nice series of capabilities and a decent community around it. In January 2006, JFreeReport became part of the Pentaho suite.
    (source: http://www.squidoo.com/osbi)
  • Mondrian: One of the oldest open source BI components, registered in 2001. It is also used as the OLAP engine in other open source OLAP and BI suite projects.
  • JPivot: A JSP tag library supporting XMLA that provides a front-end OLAP table to the Mondrian OLAP engine, allowing typical OLAP functions such as slice-and-dice, drill-down and roll-up.
  • gOLAP: Gratis OLAP [gOLAP] has been in the planning stage since its registration on SourceForge in 2001. There are some files in CVS, but nothing has been released. From its SourceForge description: "gOLAP is a BSD-licensed OLAP server engine and client API. It is a hypercube-based Analytical Processing engine intended for general high performance applications."
  • PALO: A recent entry to the open source OLAP field. It differs in that it is essentially an add-in for Microsoft Excel. PALO provides an MDDB for Excel, with future plans to allow access through other APIs as well. From their homepage: "Palo is an advanced data store for Microsoft Excel that allows you to handle large amounts of Excel data on a small number of worksheets. In addition, it also allows you to share Excel data real-time with your colleagues."
  • pocOLAP: A web-based, cross-tab reporting tool written in Java that also allows for drill-down. The name comes from "poco", meaning "little" in Italian and Spanish.
    (source: http://www.squidoo.com/osbi)
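The OLAP functions named in the JPivot note (slice-and-dice, drill-down, roll-up) can be illustrated with a minimal sketch in plain Python. This is not how Mondrian or JPivot are implemented; the fact rows, the dimension names and the `aggregate` helper are all hypothetical and exist only to show what each operation does to a small cube.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, region, product, sales). Illustrative data only.
FACTS = [
    (2005, "East", "Widgets", 100),
    (2005, "East", "Gadgets", 50),
    (2005, "West", "Widgets", 70),
    (2006, "East", "Widgets", 120),
    (2006, "West", "Gadgets", 30),
]
DIMENSIONS = ("year", "region", "product")


def aggregate(facts, group_by, where=None):
    """Sum sales per combination of the group_by dimensions.

    `where` maps a dimension name to a required value (a slice or dice filter).
    """
    where = where or {}
    totals = defaultdict(int)
    for row in facts:
        record = dict(zip(DIMENSIONS, row[:3]))
        if any(record[dim] != value for dim, value in where.items()):
            continue  # row falls outside the requested slice
        key = tuple(record[dim] for dim in group_by)
        totals[key] += row[3]
    return dict(totals)


# Roll-up: aggregate to the coarsest level of interest (year only).
roll_up = aggregate(FACTS, group_by=("year",))

# Drill-down: add a dimension to see the detail behind each yearly total.
drill_down = aggregate(FACTS, group_by=("year", "region"))

# Slice: fix one dimension to a single value and look at the rest.
east_slice = aggregate(FACTS, group_by=("product",), where={"region": "East"})
```

An OLAP engine answers the same kind of question from a query language (MDX in the case of Mondrian) instead of Python calls, and typically works from precomputed or cached aggregates.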
  • KETL: An ETL tool for high-volume transactions, developed by Kinetic Networks and delivered as part of the Bizgres suite.
  • Enhydra Octopus: Part of the ObjectWeb GForge project, providing JDBC data transformations.
  • Pequel ETL: According to its SourceForge description, a comprehensive and high-performance data processing/transform system. It features a simple, user-friendly, event-driven scripting interface that transparently generates and executes highly efficient Perl/C code. Uses: ETL, data warehousing, statistics, and data cleansing.
  • Clover ETL: An open source Java-based framework for building data transformations (ETL applications).
  • CpluSQL: The cplusql distributed ETL tool extracts and transforms row-based data from databases and flat files for terabyte-scale data warehouse loading.
  • JetStream: The first open source ETL tool that we used. It is described as a Java Extraction Transformation Service for Transmitting Records & Exchanging Application Metadata: a Java-based ETL/EAI tool.
  • KETTLE: Don't confuse KETL and KETTLE; they are not related. K.E.T.T.L.E (Kettle ETTL Environment) is a metadata-driven ETTL tool (Extraction, Transformation, Transportation & Loading).
  • OpenDigger: A Java-based compiler for the xETL language. xETL is a language specifically designed to read, manipulate and write data in any format and database. With OpenDigger/xETL you can build powerful Extraction-Transformation-Loading (ETL) programs.
    (source: http://www.squidoo.com/osbi)
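For readers new to the term, the three ETL stages that all of these tools implement can be sketched with the Python standard library. This is a minimal illustration under stated assumptions, not the approach of any product listed above: the CSV content, the `orders` table and the function names are hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source extract; in practice this would be a file or a source database.
SOURCE_CSV = """order_id,customer,amount
1, alice ,10.50
2,BOB,not-a-number
3,carol,7.25
"""


def extract(text):
    """Extract: read raw rows from a CSV export."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalize names, convert amounts, drop rows that fail to parse."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # data cleansing: skip malformed amounts
        clean.append((int(row["order_id"]), row["customer"].strip().lower(), amount))
    return clean


def load(conn, rows):
    """Load: write the cleaned rows into a warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders (order_id INTEGER, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SOURCE_CSV)))
loaded = conn.execute("SELECT customer, amount FROM orders ORDER BY order_id").fetchall()
```

Dedicated ETL tools add what this sketch leaves out, such as scheduling, metadata, parallel execution and bulk loading at data warehouse scale.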
  • BEE Project: One of the first open source BI suites, around since 2002. It provides ETL, ROLAP, reporting and integration with the R Project, is written in Perl, and primarily supports MySQL.
  • Bizgres: A distribution of PostgreSQL with specific modifications to increase performance and support use as a data warehouse. In addition, the Bizgres project comes with the KETL ETL tool and JasperReports. The project is supported by a consortium of three companies: Greenplum, Kinetic Networks, and JasperSoft.
  • OpenReports Portal: MarvelIT's OpenReports Portal provides reporting, charting and portal capabilities.
  • OpenI: Provides a web-driven interface to OLAP, relational, statistical and data mining sources, giving BI integrators user interface, report definition and connector tools.
  • Pentaho: Has been getting a lot of attention since its launch and funding in 2005. The project has an impressive pedigree in its team leaders and provides quite an array of capabilities: reporting, analysis, dashboards, data mining and workflow.
  • SpagoBI: A BI platform drawing its components from the ObjectWeb consortium. Tools include metadata management, ETL, reporting, analysis, and dashboards.
    (source: http://www.squidoo.com/osbi)
  • OpenI is a J2EE web application that runs on Tomcat by default. It publishes web-based analytical reports from three types of data sources: OLAP servers, relational database servers, and data mining servers. It has three key component categories:
    Connectors: A connector's job is to speak the native tongue of an individual analytical data source. For relational data sources, OpenI uses JDBC since it is well known and standardized. For OLAP data sources, OpenI uses XMLA as the standard protocol. XMLA is supported by several OLAP servers, including Microsoft Analysis Services and Mondrian (an open source OLAP server). For data mining datasets, OpenI integrates with the R project, a popular open source data mining platform, using a native API called RServe. (Only XMLA is operational in the current release.)
    Report Definitions: OpenI uses data-source-specific report definition languages (RDLs) to define and track the reports created on the platform. Wherever possible, OpenI uses existing standard RDLs from other open source projects, such as the .jrxml definition from JasperReports for relational database reports. For OLAP and data mining reports, OpenI implements its own XML-based RDL to define the report schema. By publishing this codebase as open source, we hope that these RDLs will become more standard (and robust) through community feedback and contribution.
    User Interface: The UI for OpenI brings various existing public domain work into a single platform, mainly with the intent of making the platform extremely user friendly for a non-technical user. It is designed more for the "business analyst" than for the "database developer". For charting and pivot table components, it relies heavily on JPivot and JFreeChart and unifies them in a single, consistent navigation framework. Realizing that analytical applications usually need to be embedded into existing enterprise portals, we are also leveraging the upcoming portlet features of JPivot to integrate better with JSR-168 compliant portals. A key UI feature of OpenI is the administrator interface, where a user can create and publish new reports from existing data sources entirely through a web interface, without having to write any code or query. Also available are features like publishing in private versus public folders and customization of chart components, color palettes, etc.
    Security: OpenI uses form-based authentication that is integrated with the J2EE security structure, i.e. you can use any of the security realms defined in the J2EE configuration to authenticate the login. OpenI also integrates J2EE security with data source security, allowing the data source to enforce fine-grained data permissions. This way, user- or group-specific access policies are enforced at the data source level, enabling hierarchical data access policies. For example, a user may see only the specific subset of the cube data permitted by the OLAP security rules for their login.
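The idea of enforcing access policies at the data source level, as described in the security note above, can be sketched as follows. This is an illustrative model only and does not reflect OpenI's or any OLAP server's actual API: the cube, the role names and the `visible_cells` function are hypothetical.

```python
# Hypothetical cube cells keyed by (region, quarter); values are sales figures.
CUBE = {
    ("East", "Q1"): 100,
    ("East", "Q2"): 120,
    ("West", "Q1"): 80,
    ("West", "Q2"): 90,
}

# Hypothetical data source security rules: role -> regions the role may read.
ROLE_REGIONS = {
    "east_manager": {"East"},
    "executive": {"East", "West"},
}


def visible_cells(cube, roles):
    """Return only the cells that the given roles are permitted to see."""
    allowed = set()
    for role in roles:
        # Unknown roles grant nothing, so access is denied by default.
        allowed |= ROLE_REGIONS.get(role, set())
    return {cell: value for cell, value in cube.items() if cell[0] in allowed}


east_view = visible_cells(CUBE, ["east_manager"])
full_view = visible_cells(CUBE, ["executive"])
no_view = visible_cells(CUBE, ["guest"])
```

Because the filter is applied where the data lives, every report built on top of the data source inherits the same restriction without any per-report logic.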
  • The term "Business Intelligence" has only been in common use for a few years. From the stone age of computing until only a few years ago, it was called "reporting". In the late 1980s, the Information Warehouse was conceived. The idea was to leave data where it was and access it from anywhere with tools. Needless to say, this fad was short-lived. Soon thereafter, in the mid-1990s, Ralph Kimball published his first data warehousing book. Arguably, what we mean by Business Intelligence today took shape in those days: data is extracted from operational systems, processed, and stored in repositories designed specifically for analysis. I don't remember hearing the term Business Intelligence until around 2001, though. Dashboards, Key Performance Indicators and Scorecards brought Business Intelligence closer to the executive office, and this trend is still under way. Open Source appeared in the world of Business Intelligence only in the last year or two.
  • Open sourcebi

    1. The State of Open Source Business Intelligence
       Christian Donner
    2. Getting from data to the source of a problem can be hard ...
       A czar learned that the most disease-ridden province of his empire was also the province with the most doctors. His solution? He promptly ordered all the doctors shot dead. (He clearly lacked Business Intelligence.)
       Folktale from: Freakonomics - A Rogue Economist Explores the Hidden Side of Everything (Steven D. Levitt, Stephen J. Dubner)
       www.molecular.com
    3. … or easy …
       "How would you rate the overall job President George W. Bush is doing as president -- excellent, pretty good, only fair, or poor?"
       Chart: share of respondents answering "Excellent or pretty good", by month, 2001 to 2006 (vertical axis 20% to 90%).
       Source: Harris Poll, published by the Wall Street Journal Online on 5/12/2006
    4. Poll
       • Who has implemented something that you would define as a BI solution before, either in your own organization or for someone else?
       • Out of this group, who has used an Open Source BI product?
       • Survey on http://cdonner.com (20 responses):
           • Currently using BI: 85%
           • Currently using OSBI: 40%
           • Evaluated OSBI in the past: 40%
           • Planning to use OSBI: 35%
    5. Why this presentation?
       • 2 years ago I started a low-budget BI project
       • Researched many products and technologies
       • OSBI was practically non-existent
       • Decided to go with Microsoft DTS and SQL RS
       • Today, the landscape has changed dramatically
       • I wanted to know: would I go with Open Source BI today?
    6. Agenda
       • What is Business Intelligence?
       • BI Trends
       • OSBI Trends
       • Products
           • Pentaho
           • Jaspersoft
           • OpenI
           • BIRT
           • Bizgres
           • Mondrian
       • Demo
    7. Business Intelligence – A Definition
       • Business Intelligence
           • In 1989 Howard Dresner (Gartner Group) coined the term "BI": "A set of concepts and methods to improve business decision-making by using fact-based support systems."
       • Wikipedia:
           • the technology used for collecting and analyzing business information
           • a set of business processes for this purpose
           • the information obtained from these processes
       • Includes:
           • ETL Tools
           • OLAP/Data Analysis Tools
           • Reporting Tools
           • Databases
    8. Business Intelligence - Components
    9. Business Intelligence Platform
       • Integrate with business processes
       • Manage and schedule reports
       • Deliver reports through multiple channels, push and pull model support
       • Maintain user security
       • Seamlessly integrate via open standards with portals and applications
    10. Agenda
       • What is Business Intelligence?
       • BI Trends
       • OSBI Trends
       • Products
           • Pentaho
           • Jaspersoft
           • OpenI
           • BIRT
           • Bizgres
           • Mondrian
       • Demo
    11. Forecast: Business intelligence market growth
       Chart: BI market size (US$ millions, axis $0 to $8,000), actual and forecast, stacked by BI services revenue, BI maintenance revenue and BI license revenue.

       Year     2003      2004      2005      2006      2007      2008
       Size     $5,253M   $5,596M   $5,997M   $6,506M   $7,005M   $7,331M
       Growth   N/A       6.5%      7.2%      8.5%      7.7%      4.7%

       Source: Forrester Research, "Business Intelligence Growth Is Driven By Compliance, Standardization, And Performance Initiatives"
    12. Mainstream BI Theme
       • Keith Gile, Forrester Research: "We are witness to a change in BI that shifts the emphasis away from functionally powerful tools for power-user "producers" toward context-sensitive BI solutions for a large community of "consumers" of information."
       • Paul Doscher, CEO Jaspersoft: "The big commercial tool providers can handle performance management applications well, but left Operational BI behind."
       • License bottleneck
           • Lower-level in-house user
           • Public web sites
    13. Forrester Wave™: BI Enterprise Reporting, Q1 '06
       Where are the Open Source contenders?
       Source: Forrester Research
    14. Agenda
       • What is Business Intelligence?
       • BI Trends
       • OSBI Trends
       • Products
           • Pentaho
           • Jaspersoft
           • OpenI
           • BIRT
           • Bizgres
           • Mondrian
       • Outlook
    15. Why Open Source?
       Source: Survey by Computer Economics, Frank Scavo
    16. The Jaspersoft Story: BI for Everyone
       Operational executives are being asked to make decisions more critical to the corporation more frequently, especially with the added scrutiny of SOX compliance.
       Legacy BI companies have failed to solve the problem:
       • "We have Business Objects installed on our corporate data warehouse and 80% of our users only use 20% of the functionality." Michael Heschel, EVP Information Systems and Services, The Kroger Company
       • "A typical installation for a 1000 users for the full BI suite could run as high as $450k - $700k for software licenses alone." Intelligent Enterprise, August 2005
       • The full Business Objects product takes 180 CDs to install!!!
       This slide © 2005 JasperSoft, Inc.
    17. Organizational Involvement with OS BI
       Pie chart segments: Don't know; Has deployed open source BI software; Not considering open source BI software; In development with open source BI software; Considering open source BI software.
       Values shown: 8%, 9%, 21%, 19%, 43% (the transcript does not preserve which value belongs to which segment).
       (c) 2006 Ventana Research Open Source and BI Research
    18. Comparison of OS BI with Commercial BI
       Bar chart, one bar per criterion, with the percentage printed next to each label:
       • Cost of Ownership: 77%
       • Openness/Flexibility: 80%
       • Database Support: 72%
       • Reliability: 69%
       • Metadata Support: 62%
       • Manageability: 57%
       • Scalability/Performance: 61%
       • Ease of Use: 57%
       Legend: Significantly more capable; More capable; Equivalently capable; Less capable; Significantly less capable; Don't know
       (c) 2006 Ventana Research Open Source and BI Research
    19. Extranet Applications - The "Beachhead" of Open Source BI?
       • Technology requirements favor open source
           • Pure J2EE offerings provide a better technology fit than legacy BI technology
       • Licensing requirements contradict prevailing proprietary models
           • "Named user" only: doesn't map to extranet usage
           • Role-based: meaningless in extranets
           • >$1,000 USD per named user: cost prohibitive
           • Net/net: The "old school" BI licensing model breaks down
    20. Free software for sale!
       • Community-based vs. for-profit companies
       • Open Source has become a business model
       • Acquisition of your vendor can change the terms under which you use OS SW
       • Example: Bill Venners' account of using Jive for Artima.com
       • Example: Snort, and the proposed sale of Martin Roesch's Sourcefire to Check Point Software
       • Whatever you do, factor in that your Open Source product may not always remain open source.
    21. Example: Jaspersoft Business Model
       • JasperIntelligence Product Family
           • JasperReports, iReports
           • Jasper Decisions, Jasper Server
           • Soon: JasperAnalytics, JasperETL
       • Commercial / Dual license
           • Services packages on subscription basis (JS & JR)
           • Commercial License, Support, Training, Documentation
           • CPU-based pricing plus Support (JD)
           • Support pricing (JR)
           • Incident support plus three annual support options, from web-based self-service to comprehensive 24x7x365
       • Leveraging strong and loyal community
           • SourceForge → JasperForge
       This slide © 2005 JasperSoft, Inc.
    22. Agenda
       • What is Business Intelligence?
       • BI Trends
       • OSBI Trends
       • Products
           • Pentaho
           • Jaspersoft
           • OpenI
           • BIRT
           • Bizgres
           • Mondrian
       • Outlook
    23. OSBI Explosion
       • There are about 25 products competing in this space, about half of which did not exist prior to 2005.
       • Many of them will probably return to insignificance
       • Because we are so early in the maturity cycle, it is difficult to make judgments about who will make it.
    24. Open Source Reporting Tools
       • Eclipse BIRT (Actuate)
       • Jasper Reports
       • JFreeReport
       • DataVision
       • Open Reports
       • OpenRPT
       • Agata Reports
    25. Eclipse BIRT
       • (Business Intelligence) and Reporting Tools
           • Eclipse Report Designer (ERD)
           • Eclipse Report Engine (ERE)
           • Eclipse Charting Engine (ECE)
           • Web Based Report Designer (WRD)
       Source: Actuate
    26. BIRT 2.0 Features
       • Released January 20, 2006
       • Re-Use Library – A report component environment allows developers with a range of expertise to share report components or functions for reuse.
       • Page-on-Demand HTML – A page-on-demand navigation mechanism enables the efficient viewing of large report documents over the internet.
       • CSS Style Sheets – External style sheets can be used across multiple report designs, making it easy to establish a common look across all reports in one application.
       • Scripting Editor – BIRT supports the ability to code or script the behavior of reports using a perspective for Java Code Editing for BIRT reports.
       • Large, Persistent Reports – Report developers can generate a report and then distribute a URL to end-users.
       • Improved Charting Facility, Scripting – BIRT 2.0 includes a wizard for building common usage charts and advanced capabilities for including detailed charts within a report design.
    27. Open Source OLAP Tools
       • Mondrian
       • JPivot
       • gOLAP
       • PALO
       • pocOLAP
    28. Open Source ETL Tools
       • Clover ETL
       • CPluSQL
       • Enhydra Octopus
       • JetStream
       • KETL
       • Kettle
       • OpenDigger
    29. Open Source BI Suites
       • BEE
       • Bizgres
       • OpenI
       • Pentaho
       • SpagoBI
    30. JasperIntelligence Architecture
       Architecture diagram; labels, roughly top to bottom:
       • User community: Customers, Business Analyst, Domain User, Executive
       • Output: HTML + AJAX, PDF, MS Excel, MS Word
       • JasperExplorer
       • Interfaces: HTTP, SOAP, Web Services, Java API
       • Products: JasperReports, JasperDecisions, JasperAnalysis, JasperETL
       • JasperServer: Reporting Services, Metadata Services, OLAP Services, ETL Services
       • Content Store: Report Definition, Rendered Content, Images, Fonts, etc., Meta Data
       • Data: Cube / Data Mart, Operational Data Source
       • JasperIntelligence Platform data access: JDBC, POJO, XML, XML/A, JasperETL
       • Corporate data: Finance, Purchasing, Inventory, …
       This slide © 2005 JasperSoft, Inc.
    31. Pentaho
       Source: Pentaho
    32. OpenI
       Source: OpenI/Loyalty Matrix
    33. OpenI at a Glance
       • J2EE Web Application
       • Standards-based, integrates other Open Source components
       • Connectors for relational (JDBC), OLAP (XMLA), and data mining data sets (RServe); currently only XMLA
       • Supports Jasper .jrxml and custom RDL
       • JPivot for pivot tables, JFreeChart
       • Supports JSR-168
       • Form-based authentication with J2EE Security
    34. Bizgres
       • Sponsored by Greenplum
       • Bizgres is a distribution of PostgreSQL (Open Source DB)
       • Bizgres includes the following components:
           • PostgreSQL 8.1.3 (Open Source RDBMS)
           • Bizgres Loader (Mass data loading utility)
           • Demonstration Programs and Utilities
           • KETL Integration (ETL solution for web log analysis)
           • JasperReports Integration
           • Bizgres Clickstream
    35. Bizgres Clickstream Architecture
       Source: Greenplum
    36. Who leads the pack?
       Chart: downloads by month (axis 0 to 60,000), February 2005 to April 2006, for Mondrian, JFreeReport, Pentaho and Jasper Reports.
       Source: sourceforge.net
    37. Agenda
       • What is Business Intelligence?
       • BI Trends
       • OSBI Trends
       • Products
           • BI suites
           • ETL tools
           • OLAP
           • Reporting tools
           • Databases
       • Demo
    38. The State of Open Source Business Intelligence
       • "Business intelligence" is a broad umbrella term
       • Lot of buzz in the media and from analysts
       • Young and growing market
       • Immature, but rapidly improving products
       • No clear market leader
    39. Thank you! Q&A
       Would I go with Open Source BI today? How about you?
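As a worked check of the Forrester forecast on slide 11, the growth row can be recomputed from the size row. The sketch below uses only the market sizes printed on the slide; the function name `yearly_growth` is an illustrative choice, and rounding to one decimal place is an assumption that happens to match the slide's figures.

```python
# BI market size in US$ millions, as printed on slide 11 (Forrester Research).
MARKET_SIZE_MUSD = {
    2003: 5253,
    2004: 5596,
    2005: 5997,
    2006: 6506,
    2007: 7005,
    2008: 7331,
}


def yearly_growth(sizes):
    """Return year-over-year growth in percent, rounded to one decimal place."""
    years = sorted(sizes)
    return {
        year: round((sizes[year] / sizes[prev] - 1) * 100, 1)
        for prev, year in zip(years, years[1:])
    }


growth = yearly_growth(MARKET_SIZE_MUSD)
for year, pct in growth.items():
    print(f"{year}: {pct}%")
```

The recomputed values agree with the growth percentages on the slide, which suggests the two rows of the table were transcribed consistently.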
