Hadoop World 2011: The State of Big Data Adoption in the Enterprise - Tony Baer - Ovum


As Big Data has captured attention as one of “the next big things” in enterprise IT, most of the spotlight has focused on early adopters. But what is the state of Big Data adoption across the enterprise mainstream? Ovum recently surveyed 150 global organizations, in a variety of vertical industries, that have revenue of $500 million or more and manage large enterprise data warehouses. We will share the findings from the research in this session. We will reveal similarities in awareness, readiness, and business drivers when compared to early data warehousing adoption back in the mid-1990s. We will discuss how early experience with the emergence of data warehousing can provide a roadmap for proceeding with Big Data implementations in the next 2-5 years.

Published in: Technology, Business
  • Apologies about the busy slide, but we’ll just focus on a few highlights. Scores are weighted on a scale of 1 to 4, from not important to very important; “unsure” responses were a small percentage of the total. Some interesting patterns show up here among the sectors that had the largest representation in the sample.
    Among the highlights, we saw peaks of interest in the following. Text remains by far the most popular data type, as text analytics is a fairly well-established practice. Ditto with rich media. Time series data drew interest from retail and financial services, which are highly transactional, event-intensive businesses. Surprisingly, there was also interest from the public sector. Surprisingly, healthcare was the sector most interested in social media. This indicates that healthcare providers are looking to fill a key missing link in quality-of-care (and reputational) data: data that comes from the patient or their family or friends. Surprisingly, there was very little interest in web logs, even from retail. Similarly, there was less interest in gathering and analyzing sensory data.
    As a group, these specific sectors had higher levels of interest in most of these data types compared with the entire sample. That discrepancy is likely an artifact of sampling: because these groups were more highly represented, our survey team probably reached more organizations with active interest.
    This data hints at the differences that are likely to emerge as use of Big Data analytics matures. For instance, gas, water, and electric utilities have a vital stake in monitoring devices on their transmission systems and at the points of consumption, especially if they are implementing smart metering grids or similar programs to manage resource demand. We would not expect the same demand from financial services. Similarly, manufacturers, retailers, and logistics providers have a clear interest in tracking onboard GPS devices and external feeds on road and weather conditions to track trucking deliveries.
Ovum believes that as Big Data analyses become more commonplace in the enterprise mainstream, vertical industry differences regarding data types will eventually emerge. For instance: graph data will be most useful where the need is for tracking customer sentiment based on the interactions of social groups, or for traffic flow analysis that may be used by municipal transport agencies or trucking companies; weblogs will be used by online sites seeking to understand and drive incoming traffic; rich media data will be useful for a broader range of segments, such as entertainment companies that tag and track media assets and security and law enforcement agencies that track terrorist or criminal activity; social media data will be useful for any sector that is consumer- (or, in the case of public sector, voter-) driven, as it provides, in effect, the world's largest virtual focus group; and text and document data will be useful for organizations subject to regulatory compliance mandates.
  • Supply chain is under-represented because only half the respondents have supply chains; the other half were from the service industries. Sales & marketing is highlighted because it came from the 80% of respondents who were in the private sector.
  • When you add up the numbers, the total comes out to 45% of respondents; however, as a significant subset might be implementing multiple Advanced SQL platforms, the actual proportion of respondents will likely be lower.
  • Reveals a lot of development of analytics as web apps, where web languages are used in conjunction with (not instead of) SQL on the back end.
  • Weighted on a scale of 1 to 4
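Two pieces of arithmetic in the slide notes can be illustrated with a short sketch: the 1-to-4 weighted score, and the reason that summing per-platform percentages overstates the share of respondents when some run more than one platform. The notes do not say exactly how Ovum computed its weighted scores or how "unsure" answers were treated, so this sketch assumes a plain mean with "unsure" (represented as `None`) excluded. The function names, platform names, and all sample data are hypothetical and are not taken from the survey.

```python
def weighted_score(ratings):
    """Mean of ratings on a 1-4 scale (1 = not important, 4 = very important).

    Assumption: 'unsure' answers are passed as None and left out of the mean.
    Returns None when no usable ratings remain.
    """
    valid = [r for r in ratings if r is not None]
    if not valid:
        return None
    for r in valid:
        if r not in (1, 2, 3, 4):
            raise ValueError(f"rating outside the 1-4 scale: {r!r}")
    return sum(valid) / len(valid)


def platform_shares(adoption, n_respondents):
    """Compare the summed per-platform share with the distinct-respondent share.

    adoption maps a platform name to the set of respondent ids using it.
    Returns (sum of per-platform shares, share of distinct respondents).
    """
    summed = sum(len(ids) for ids in adoption.values()) / n_respondents
    distinct = len(set().union(*adoption.values())) / n_respondents
    return summed, distinct


# Made-up ratings for one data type; None stands for an 'unsure' answer.
example_ratings = [4, 4, 3, 2, None, 1, 3]
print(weighted_score(example_ratings))

# Made-up adoption data for 10 respondents; respondents 2 and 3 each use
# two platforms, so they are counted twice in the summed share.
example_adoption = {
    "platform_a": {1, 2, 3},
    "platform_b": {3, 4},
    "platform_c": {2},
}
summed, distinct = platform_shares(example_adoption, n_respondents=10)
print(summed, distinct)
```

In the made-up adoption data, the per-platform shares add up to more than the share of distinct respondents, which is the same effect the note on Advanced SQL platforms describes for the 45% total.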

    1. The State of Big Data Adoption in the Enterprise. Tony Baer, [email_address], November 2011
    2. Agenda: Big data is not limited to big companies; Variety trumps volume; Implementation: The rules on the ground; Project/budget plans
    3. Survey sample by region. Regions: Europe, North America, APAC. Shares shown: 27%, 33%, 40%. Sample size = 150 organizations. Minimum > 1+ TByte data in enterprise DWs/analytic data stores.
    4. Company size & IT budget. Big data is not limited to big companies! Annual IT budget labels: <$10m, Don't Know, $50m+, $10m - $50m. Shares shown: 9%, 7%, 7%, 77%. Company size labels: $50m - $250m, <$10m, >$1bn, $250m - $1b, $10m - $50m. Shares shown: 22%, 18%, 17%, 21%, 22%.
    5. Survey sample by vertical industry. Industries: Other, Media & Entertainment, Mfg, Public sector (government), ICT, Healthcare, Financial services, Retail, Transportation. Shares shown: 5%, 16%, 13%, 5%, 21%, 20%, 5%, 9%, 6%.
    6. Agenda: Big data is not limited to big companies; Variety trumps volume; Implementation: The rules on the ground; Project/budget plans
    7. Mean analytic data store size. Chart of number of respondents (1 to 150) against data store size in terabytes (1000 to 8000). 3TB = mean.
    8. Analytic data store size by data type. Series: Structured data; Unstructured/Variably structured data. Size bands: No unstructured data, 1 - 5 TBytes, 5 - 10 TBytes, 10 - 20 TBytes, 20 - 50 TBytes, 50 - 100 TBytes, 100 - 500 TBytes, 500 - 1000 TBytes, Over 1000 TBytes. Axis: 0% to 30%.
    9. Popular analytic data types. Series: Currently; Planned. Data types: B2B transactional, Supply Chain Mgmt, Call Detail Records (CDRs), Internet search indices, Legal/regulatory documentation, Legacy Apps, CRM, ERP, Email/Messaging. Axis: 0% to 90%.
    10. Variably structured data: current & future demand. Weighted score by data type: Social Media, Web Logs, Sensory, Graph, Rich media, Text, Time series. Series: All sectors, Financial Svcs, Healthcare, Retail, Public Sector. Sector callouts on the chart, in transcript order: Healthcare, Healthcare, Healthcare, Healthcare, FS, FS, Retail, Retail, Healthcare, Public Sector, FS, FS, FS, Healthcare.
    11. Agenda: Big data is not limited to big companies; Variety trumps volume; Implementation: The rules on the ground; Project/budget plans
    12. Business objectives. Big Data analytic projects change the analytics, but not the objectives. Objectives: Other, Advanced analytics, Competitive positioning, Business agility, Regulatory compliance, ID hidden business trends, Predictive analytic insights, Business forecasting, Customer service, Strategic decision making, Operational decision making. Axis: 0% to 60%.
    13. Business sponsors. The players are currently the same. Sponsors: Supply chain, Other, Customer management, Internal operations, Sales and marketing, Finance. Axis: 0% to 60%.
    14. Advanced SQL analytic database use. A 5-year head start on NoSQL. Platforms: Oracle Exadata, Sybase IQ, Aster Data, IBM Netezza, Sand, Greenplum, Kognitio, Teradata, ParAccel, Vertica, Infobright. Axis: 0% to 12%.
    15. NoSQL platform use. Series: Considering; Testing/Evaluating; In Production. Platforms: Amazon SimpleDB, Hadoop, MongoDB, Membase, Cassandra, CouchDB. Axis: 0% to 12%.
    16. Non-SQL languages/frameworks for analytic queries. Series: Now; Future. Options: Java, None, PHP, C or C++, Other language, Perl, Other framework/technology?, Python, Ruby, MapReduce. Axis: 0% to 90%.
    17. Big Data in the cloud? Responses: No, Yes, Considering in next 12 months, Considering in next 12 - 36 months. Shares shown: 24%, 9%, 9%, 58%.
    18. Technical concerns, rated from Not Important to Important: Analytic/NoSQL databases technology immaturity; Current DW investments not easily extended; In-house analytic skills; Increased capital investment; Visualizing large datasets; Technology infrastructure costs; Analysis requires additional IT resources; Data integration complexity; Systems & network performance impact; High data volatility/refresh cycles; Information filtering technology; Data storage issues & costs; Performance issues & query response times; Data quality & governance.
    19. Agenda: Big data is not limited to big companies; Variety trumps volume; Implementation: The rules on the ground; Project/budget plans
    20. Who will deliver Big Data technologies & solutions? Plenty of room for new blood. Options: Existing data warehousing/BI analytics supplier, In-house, Systems integrator, Specialist provider, Other technology provider. Axis: 0% to 60%.
    21. Big Data IT budget plans. Based on the 25% of respondents who provided actual numbers. Periods: Current IT budget, Next year's IT budget, Next 2 - 5 yrs. Budget bands: Under $100,000, $100,000 - $1 million, $1 - 5 million, Over $5 million.
    22. Big Data budget plans. Proportion of respondents by period: Current IT budget, Next year's IT budget, In next 2 - 5 years. Shares shown: 27%, 33%, 44%.
    23. Thank you. Any questions? Tony Baer. Email: [email_address]. Twitter: @TonyBaer. All Rights Reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior permission of the publisher, Ovum (an Informa business). The facts of this report are believed to be correct at the time of publication but cannot be guaranteed. Please note that the findings, conclusions and recommendations that Ovum delivers will be based on information gathered in good faith from both primary and secondary sources, whose accuracy we are not always in a position to guarantee. As such Ovum can accept no liability whatever for actions taken based on any information that may subsequently prove to be incorrect.