Open Source Solutions: Managing, Analyzing and Delivering Business Information
 


These slides on the usage of open source solutions within the business intelligence and data warehousing market accompany a webcast and research report. The webcast is archived at http://ow.ly/KLz0 along with a PDF of the report. This presentation describes what open source software is being deployed and presents the benefits, challenges and practices for organizations adopting open source technologies.


Usage Rights

CC Attribution-NonCommercial-NoDerivs License


    Presentation Transcript

    • Open Source Solutions: Managing, Analyzing and Delivering Business Information Mark R. Madsen – February 2009 www.ThirdNature.net
    • The First Recorded Patent February 2009 Mark Madsen Slide 2
    • The First Monopoly February 2009 Mark Madsen Slide 3
    • The Origin of Copyright • 1556: The Worshipful Company of Stationers and Newspaper Makers is granted a Royal Charter, giving it a monopoly over the publishing industry until … • 1710: “An Act for the Encouragement of Learning, by vesting the Copies of Printed Books in the Authors or purchasers of such Copies, during the Times therein mentioned”, otherwise known as the Statute of Anne, put the rights into the hands of authors February 2009 Mark Madsen Slide 4
    • After Each Revolution, the Old Pirates Become the New Establishment Pirate Establishment February 2009 Mark Madsen Slide 5
    • What is Commercial Software, Really? February 2009 Mark Madsen Slide 6
    • What Makes Software Open Source? A spectrum of licenses, from more freedom to less freedom: Academic Licenses, Reciprocal Licenses, “Freeware” Licenses, Commercial Licenses. The fuzzy dividing line between open and closed source. February 2009 Mark Madsen Slide 7
    • Some Quick Definitions Proprietary Software: software under a license that provides limited usage rights only, provided in binary format. Open Source Software (OSS): software under a license that allows acquisition, modification and redistribution. Freeware: software that does not have licensing limitations, generally distributed in binary format. Not the same as open source. February 2009 Mark Madsen Slide 8
    • Fauxpen Source: something appearing with greater frequency as open source becomes more popular and lower tier proprietary vendors seek a differentiator. February 2009 Mark Madsen Slide 9
    • Evolution of the Software Market 1987 Source: John Prendergast (data: Bloomberg, Factset) February 2009 Mark Madsen Slide 10
    • Evolution of the Software Market 1997 Source: John Prendergast (data: Bloomberg, Factset) February 2009 Mark Madsen Slide 11
    • Evolution of the Software Market 2007 Source: John Prendergast (data: Bloomberg, Factset) February 2009 Mark Madsen Slide 12
    • The DW & BI Software Market Today According to IDC, the analytics and data warehouse software market is growing at 10.3% CAGR. Market size by year: 2005: 17,386; 2006: 19,342; 2007: 21,408; 2008: 23,601; 2009: 26,001; 2010: 28,682; 2011: 31,595. February 2009 Mark Madsen Slide 13
    • Any Industry This Big is Maturing [Chart: annual US software sales, 1970 to 2000] Source: US Dept. of Commerce February 2009 Mark Madsen Slide 14
    • “If the automobile had followed the same development as the computer, a Rolls-Royce would today cost $100, get a million miles per gallon, and explode once a year killing everyone inside.” – Robert Cringely [Chart labels: Reality, Anything, Time]
    • Software Revenue = Corporate IT Cost [Chart: IT costs as a percent of equipment investment, 1968 to 2004] Source: US Dept. of Commerce February 2009 Mark Madsen Slide 16
    • Open Source is an Inevitable Consequence If the means of production is widely distributed at commodity cost, and the internet connects all those means of production, and the supply of any software program is infinite, then we need to rethink some things. “The era of high capital industrial production is giving way to a different model.” – Peter Drucker February 2009 Mark Madsen Slide 17
    • A Perfect Commodity Changes Things Open source is a means of production and distribution of software, and is driving change in the market. But the fact that the internet is a massive copying machine for the perfect commodity is the real change in conditions. The basis of open source is economics, not ideology. February 2009 Mark Madsen Slide 18
    • The Real State of Enterprise Software? February 2009 Mark Madsen Slide 19
    • Enterprise Software Economics The enterprise software model is breaking down. Some facts: • 70% - 80% of sales & marketing is for new sales • 76% of new license revenue goes to sales & marketing • Maintenance makes up 45% of revenues and this number is increasing • 75% of R&D for mature products is for updates, bug fixing, and non-revenue enhancements • Maintenance and support is becoming the biggest factor in software company profitability. Sources: Goldman Sachs, Tech Strategy Partners, Forrester February 2009 Mark Madsen Slide 20
    • Open Source Disruption “Which sector of the industry is most vulnerable to disruption by open source in the next five years?” 1. Web publishing and content management 2. Social software 3. Business Intelligence Source: North Bridge Venture Partners February 2009 Mark Madsen Slide 21
    • BI is Entering Mainstream Adoption The BI market has lots of segments, most new, some mature, some being rejuvenated. Reporting Databases & Analysis Platforms Data Integration Predictive analytics February 2009 Mark Madsen Slide 22
    • Maturity for OSS Components of the Stack Dashboards & Scorecards Visualization Information delivery Analytics / OLAP clients Predictive Analytics Interactive Reporting GIS & location Standard Reporting Modeling Portal Search/Discovery Workflow Information Management DW/Mart/ODS OLAP servers MDM* Data Quality Integration Management ETL EII EAI EDR Metadata Infrastructure Servers Operating Systems Databases February 2009 Mark Madsen Slide 23
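    • Interest in and Use of Open Source Database: 18% in production, 13% prototype or pilot, 18% evaluating, 29% considering, 22% no plans. Data integration and ETL: 18% in production, 12% prototype or pilot, 17% evaluating, 31% considering, 22% no plans. Business intelligence: 14% in production, 8% prototype or pilot, 22% evaluating, 37% considering, 19% no plans. Advanced analytics: 5% in production, 8% prototype or pilot, 18% evaluating, 43% considering, 26% no plans. Source: Third Nature Open Source BI/DW adoption survey February 2009 Mark Madsen Slide 24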
    • Database Use MySQL 75% Postgres 44% Infobright 11% EnterpriseDB 10% BerkeleyDB 8% Ingres 7% Firebird 7% Palo 3% CouchDB 3% SQLite 3% MonetDB 3% LucidDB 2% Kickfire 2% Bizgres 2% Source: Third Nature Open Source BI/DW adoption survey February 2009 Mark Madsen Slide 25
    • Data Integration Tool Use What’s popular: Pentaho DI / Kettle 42%, Talend 33%, Jitterbit 13%, DataCleaner 8%, Red Hat Teiid 5%, Apatar 5%, OSDQ 2%, Open Data Quality 2%, Clover 2%. What it’s being used for: Batch ETL for a data warehouse or mart 30%, Operational integration 21%, Data migration efforts 15%, Data quality efforts 15%, Master data management efforts 10%, Low‐latency ETL for a data warehouse or mart 8%. Source: Third Nature Open Source BI/DW adoption survey February 2009 Mark Madsen Slide 26
    • BI Tool Use What’s popular: Pentaho 47%, Jaspersoft 28%, Mondrian 26%, BIRT 19%, Jfree 14%, SpagoBI 9%, OpenI 5%, MarvelIT 5%, Palo 2%, OpenReports 2%. What it’s being used for: Static reports 20.7%, Dashboards or scorecards 17.1%, End user or interactive reporting 16.5%, Reporting against an application database 15.9%, Reports embedded in an application or website 15.2%, OLAP 14.6%. Source: Third Nature Open Source BI/DW adoption survey February 2009 Mark Madsen Slide 27
    • Advanced Analytics Use R 46% Weka 42% RapidMiner 23% Knime 8% Graphviz 8% Orange 7% Processing 4% Axiis 4% Taverna 3% Cytoscape 2% Source: Third Nature Open Source BI/DW adoption survey February 2009 Mark Madsen Slide 28
    • Usage of the tools [Chart: share of respondents, by tool category (Database, Data Integration, BI, Adv. Analytics), for each usage: replacing proprietary software; replacing internally developed software; supplementing a system with similar features; adding new functionality to an existing system; using as part of a new system or project. Plotted values range from 7% to 53%.] Source: Third Nature Open Source BI/DW adoption survey February 2009 Mark Madsen Slide 29
    • Who’s Adopting Open Source for BI/DW? 1. The under-budgeted 2. ISVs 3. The under-served 4. The over-served 5. Developers who never had it before More co-existence and use in edge cases than straight replacements, and often competing with lack of use. February 2009 Mark Madsen Slide 30
    • Adoption by Organization Size February 2009 Mark Madsen Slide 31
    • Adoption by Size of Organization Using: Small 32%, Medium 23%, Large 23%. Evaluating: Small 37%, Medium 41%, Large 38%. Medium and large are the two biggest evaluators, with small using the most in production. Source: Third Nature Open Source BI/DW adoption survey February 2009 Mark Madsen Slide 32
    • Scope of System Deployment [Chart: deployment scope (Department or Division vs. Corporate‐wide) by organization size (Small, Medium, Large); plotted values: 40%, 38%, 35%, 32%, 27%, 27%] Source: Third Nature Open Source BI/DW adoption survey February 2009 Mark Madsen Slide 33
    • Open Source Purchasing [Chart: purchases by organization size (Small, Medium, Large) across these categories: No purchase; Maintenance or support contract; Training; Consulting or installation services; Phone, email or on‐site support from the vendor; Commercial license; Phone, email or on‐site support from a third party; Subscription to value‐added, enterprise features. Plotted values range from 6% to 58%.] Source: Third Nature Open Source BI/DW adoption survey February 2009 Mark Madsen Slide 34
    • Where Are People Getting Information? Online articles 53% Online documentation / wikis 53% White papers 48% Online demos 47% Community forums 47% Web seminars or screencasts 37% Blogs 37% Vendor evaluation / trial support (free) 32% Print articles 29% Web‐based training 28% Third party books or documentation 27% Vendor support, paid or as part of a subscription 20% Outside consultant or systems integrator 19% Software features in a paid "professional" version of the software 17% Pre‐bundled software (e.g. a database packaged with a BI tool) 16% Classroom training 14% Support from a third party 14% Internet relay chat (IRC) 7% Source: Third Nature Open Source BI/DW adoption survey February 2009 Mark Madsen Slide 35
    • Why Consider Open Source? IT is after one of three things: February 2009 Mark Madsen Slide 36
    • Rationale When Evaluating OSS Lower cost and reducing vendor risk are the two big reasons. Lower acquisition costs 66% Open standards 48% Reduced dependence on a vendor 44% Lower maintenance costs 43% Flexibility in deployment 33% Speed of innovation of the software 32% Easier to evaluate or procure 32% Open development process and road … 32% Extensibility, customizability of software 28% Access to the source code 28% Source: Third Nature Open Source BI/DW adoption survey February 2009 Mark Madsen Slide 37
    • Good News: It Works The benefits are largely being realized. Lower costs 69% Ease of integration / open standards 43% Reduced dependence on vendor 40% Flexibility in deployment 36% Freedom from vendor lock‐in 34% Access to the source code 33% Extensibility / customizability of software 32% Speed of innovation of the software 30% Quicker turnaround on bug fixes 22% Better performance 12% Source: Third Nature Open Source BI/DW adoption survey February 2009 Mark Madsen Slide 38
    • Reduced Vendor Dependence Avoid vendor imposed upgrade cycles February 2009 Mark Madsen Slide 39
    • Why did the software evaluations fail? Missing or incomplete features 72% Scalability problems 34% Required more internal expertise than expected 32% Difficulty integrating into current environment 29% Difficulty finding available solutions 28% Reliability problems 25% Lack of available consulting 21% Interoperability problems 19% Higher costs than anticipated 18% Lack of vendor service or support 16% The biggest reason is maturity of the software. Source: Third Nature Open Source BI/DW adoption survey February 2009 Mark Madsen Slide 41
    • Data Size, All Database Types 67% of the sample < 1TB. Less than 50GB: 24%; 50 to <100GB: 15%; 100 to <500GB: 14%; 500GB to <1TB: 14%; 1 to <5TB: 13%; 5 to <20TB: 4%; 20TB to 50TB: 3%; More than 50TB: 1%. Source: Third Nature Open Source BI/DW adoption survey February 2009 Mark Madsen Slide 42
    • Performance problems Poor interactive BI or analytics performance 69% Poor performance loading data 37% Poor ETL or data integration performance 33% Poor batch reporting performance 33% Source: Third Nature Open Source BI/DW adoption survey February 2009 Mark Madsen Slide 43
    • Solving Performance Problems Replace every single thing before the database? Database or application tuning 38% Buy more powerful hardware 34% Change BI or analytics tools 32% Redesign the ETL or data integration 32% Limit the amount of data stored in the system 30% Rewrite the BI application or reports 26% Change ETL or data integration tools 18% Limit the number of users accessing the system 18% Migrate to an analytic database 10% Buy a specialized accelerator 8% Migrate to a different traditional database 4% Migrating to an analytic database is more than twice as likely as migrating to another row-store database. Source: Third Nature Open Source BI/DW adoption survey February 2009 Mark Madsen Slide 44
    • Discontinuity Drives Open Source BI Use The situations most appropriate to open source BI tools often involve discontinuous change. • New interface requirements • New integration requirements • Platform change • Schema change • Data latency / real-time requirements • Segmenting the user population The data warehouse is becoming much more diverse – one BI vendor can no longer be expected to provide tools for all needs. February 2009 Mark Madsen Slide 45
    • First Thought is Often “Replace” February 2009 Mark Madsen Slide 46
    • Coexist is More Likely Than Replace February 2009 Mark Madsen Slide 47
    • Augment is Also More Likely February 2009 Mark Madsen Slide 48
    • Recommendations 1. Don't focus solely on cost savings. Many of the benefits people discovered later were not among the reasons they gave up front. 2. Plan to augment, not replace, existing software with open source. Rather than trying to save money by replacing software, look at gaps in the BI portfolio or data warehouse stack and use open source to supplement your systems. February 2009 Mark Madsen Slide 49
    • Recommendations 3. Consider developing open source policies. Most organizations are adopting open source in an ad-hoc fashion, project by project. 4. Evaluate open source like any other software. It doesn't matter if the software is free if it takes longer to build, manage and deploy solutions to end users, if it is unstable, or if it is missing a key feature. 5. Make open source the default option. When there are no internal tools, open source should be the first alternative. February 2009 Mark Madsen Slide 50
    • “When a new technology rolls over you, you're either part of the steamroller or part of the road.” – Stewart Brand Questions? February 2009 Mark Madsen Slide 51
    • Creative Commons Thanks to the people who made their images available via creative commons: glassblower - http://flickr.com/photos/cazasco/261229878/ canal - http://flickr.com/photos/mcsixth/150749007/ rc toy truck.jpg - http://flickr.com/photos/texas_hillsurfer/2683650363/ asymmetry_building_tokyo.jpg - http://flickr.com/photos/fukagawa/2004102417/ beer_free_beer2.jpg - http://flickr.com/photos/fzero/173386050 beer_free_beer3.jpg - http://flickr.com/photos/henrikmoltke/142750871/ condiments_salsa.jpg - http://flickr.com/photos/uberculture/2462506722/ london modern and ancient together.jpg - http://www.flickr.com/photos/cc_chapman/299509390/ firemen not noticing fire.jpg - http://flickr.com/photos/oldonliner/1485881035/ acapluco_cliff_divers_cc.jpg - http://flickr.com/photos/raveller/ highway storm.jpg - http://flickr.com/photos/areyoumyrik/235230688 Tenessee chicken - http://www.flickr.com/photos/mayhem/2495739721/ February 2009 Mark Madsen Slide 52
    • About the Presenter Mark Madsen is president of Third Nature, a technology research and consulting firm focused on business intelligence, data integration and data management. Mark is an award-winning author, architect and CTO whose work has been featured in numerous industry publications. Over the past ten years Mark received awards for his work from the American Productivity & Quality Center, TDWI, and the Smithsonian Institute. He is an international speaker, a contributing editor at Intelligent Enterprise, and manages the open source channel at the Business Intelligence Network. For more information or to contact Mark, visit http://ThirdNature.net.