Exploiting the Internet of Things with investigative analytics

Published in: Technology
  1. Exploiting the Internet of Things with investigative analytics
     A White Paper by Bloor Research
     Author: Philip Howard
     Publish date: May 2013
  2. “The Internet of Things has the potential to change the world, just as the Internet did. Maybe even more so”
     Kevin Ashton
  3. © 2013 Bloor Research. A Bloor White Paper: Exploiting the Internet of Things with investigative analytics

     Introduction

     There is a wealth of information hidden in the Internet of Things that can help organisations to understand what happened or might happen and why it happened or may happen, and help to point towards what to do about it. However, before we consider how to analyse this information and why it is important to your business we need to understand what we mean by the “Internet of Things” and by “investigative analysis”.

     The Internet of Things

     The Internet of Things was first described by Kevin Ashton in 1999. He wrote: “computers – and, therefore, the Internet – are almost wholly dependent on human beings for information. Nearly all of the data available on the Internet was first captured and created by human beings – by typing, pressing a record button, taking a digital picture or scanning a bar code. Conventional diagrams of the Internet ... leave out the most numerous and important routers of all – people. The problem is, people have limited time, attention and accuracy – all of which means they are not very good at capturing data about things in the real world. And that’s a big deal. We’re physical, and so is our environment ... You can’t eat bits, burn them to stay warm or put them in your gas tank. Ideas and information are important, but things matter much more…. If we had computers that knew everything there was to know about things – using data they gathered without any help from us – we would be able to track and count everything, and greatly reduce waste, loss and cost. We would know when things needed replacing, repairing or recalling, and whether they were fresh or past their best. The Internet of Things has the potential to change the world, just as the Internet did. Maybe even more so.”

     Today there are multiple definitions of the Internet of Things but this is as good a place to start as any. The point is that a) more and more things (vehicles, smart meters, cell phones, planes, oil rigs, shop floor devices, clickstream data, anything with an active RFID tag, and so on) are being or have been instrumented and b) we now have the ability to analyse the information coming from this instrumentation in a cost-effective manner, so that the Internet of Things is becoming a reality.

     What does the Internet of Things mean to your organisation? Clearly it depends on your business but in principle it allows you to perform what we might call investigative analytics: exploring the what, why and how of all this instrumented data.

     Investigative analysis

     The term investigative analysis was first coined by Curt Monash in 2011 as a function to support “research, investigation and analysis in support of future decisions”. He defines it as “seeking (previously unknown) patterns in data”. More specifically he describes it as a “conflation of several disciplines, including statistics, data mining, machine learning and/or predictive analytics; together with the more research-oriented aspects of business intelligence tools, including ad hoc query, drill-down, most things done by BI-using ‘business analysts’, and most things within BI called ‘data exploration’; plus analogous technologies as applied to non-tabular data types such as text or graph.”

     In other words, you are interested in discovering a pattern of past activity that points to some likely outcome in the future. And you want to be able to do that across any type of data regardless of whether it is transactional or not. To put this another way: something happened – is this part of a pattern that indicates that it might happen again? If so, what is that pattern and how can we leverage it for business purposes in the future?

     In this paper we will explore some of the use cases around investigative analysis and how that can be applied to the Internet of Things, and then we go on to consider the sort of technology you need to enable this capability. We will conclude with a discussion of the solution provided by Infobright, a data warehousing vendor that is addressing the market for investigative analytics.
  4. Use cases

     There are a great many potential environments where investigative analytics might be deployed. The following represent a sampling only, and as the Internet of Things becomes more prevalent it is likely that new use cases will emerge. However, broadly speaking we can say that investigative analysis will allow you to:

     1. Discover why something went wrong and determine what to do to prevent it going wrong in the future – this would apply to things like dropped calls for mobile networks, preventative maintenance across various industry sectors, smart meters (ditto), routing of both transportation and goods and so on.
     2. Discover why something went right so that you can build processes to support an increased likelihood of things going right in the future – for example, monitoring and analysing web traffic or mobile usage to encourage upsell or cross-sell opportunities. Sales of location-based services in mobile environments are a particular case in point.
     3. Plan capacity to support requirements and service level agreements in the most cost-effective fashion. This applies particularly in smart metering environments, mobile services and transportation environments, amongst others, where forecasting and meeting future demand is essential.

     It is worth noting, before we move on to discuss individual use cases, that a number of these scenarios require real-time query processing as well as more in-depth, batch-oriented analytics.

     Smart meters

     Smart metering is of increasing interest around the world. While there are significant implementations already in the United States, the rest of the world is some way behind in this respect. However, that will change: for example, in accordance with European Union market guidelines, 80% of all households in Germany should be equipped with smart meters by 2020. Smart meters are installed in homes and businesses and feed a steady stream of data to the relevant application, where the data is analysed and the results used to allocate energy resources efficiently in real time, so that less energy is wasted. In addition, information collected from smart meters can be combined with weather forecasts and other data (such as major sporting events or TV programmes) to predict future energy requirements so that appropriate resources will be available. If we return to the German example, some 32 million households will need to be metered by 2020, which represents an enormous amount of event data that must be captured, analysed, and acted upon.

     Banking and financial services

     ATMs (automated teller machines) are not dissimilar to smart meters: they provide you with a service (money, statements and so forth) and they update your account. They have not, historically, been used to collect data in order to forecast future demand but it is clear that there is a shift away from cash and towards automated payments of various kinds. Banks are therefore looking to rationalise their ATM networks. On the other hand they do not wish to alienate existing customers that still need access to cash. Understanding who uses cash machines and how often will be fundamental to any decisions and, given that you can withdraw cash from your bank at a rival bank’s ATM, it will make sense for banks to collaborate on where ATMs should be rationalised.

     The flip side of the move away from cash is the increase in mobile payments. This raises two interesting areas with respect to investigative analysis. One is that the less people use cash and the more they use electronic payments, the easier it is to profile those individuals and to understand their preferences, which, in turn, can enable better upsell and cross-sell opportunities. The other is that there are security implications: in particular, the better you understand customer spending patterns, the better you are able to detect the likelihood that a card or mobile phone has been stolen and is being used for fraudulent purposes.

     Network analysis in telecommunications

     For both performance and planning purposes telcos need to monitor and analyse traffic. Key elements to determine are ‘hot spots’ within the network (areas with particularly high usage) and failures within the network. A proper understanding of the former, and how this is developing over time, will be critical to future investments in new infrastructure in order to meet growing demand. Conversely, any failure within the network is an immediate problem that needs to be resolved as speedily as possible. Failures may lead to connections being dropped (which are well-known potential indicators of customer churn) or reduced service. The analysis of usage trends, combined with location-based and demographic data, will be important for planning future infrastructure investments.

  5. In telecommunications there are also a number of other areas where investigative analysis may be used, for analysis associated with mobile payments, location-based services and so forth, as previously discussed.

     Preventative maintenance

     The average oil platform has 40,000 sensors. A flight across the Atlantic generates over 9 TB of data about the status of the plane you are in. Trains and railway tracks abound with sensors. However, this is not limited to transportation: equipment of all sorts, whether on construction sites or the shop floor, has built-in monitors and sensors designed to alert operators, pilots or drivers to any problems that may occur. Historically this information has been discarded rather than analysed, principally because the tools to analyse this data in a cost-effective manner were not available. With modern technology this is now changing and this wealth of information is being used to identify patterns of failure (if this component fails then that one is likely to do so within a certain period) and to predict the failure of particular elements so that preventative maintenance can avert potential problems.

     It should be noted that preventative maintenance doesn’t just apply to equipment and machinery of various sorts but also to people. For example, there are a number of professional sports bodies (for example, in football) that monitor their players’ activities on the field and in training so that they can be rested at appropriate times in order to avoid injury. As a different example, there is a company in Australia that provides drivers of trucks and heavy (mining) equipment with glasses that monitor how often the driver blinks, blinking being a sign of tiredness. Not only does this prevent accidents in the short term (the driver will be alerted if tiredness is indicated) but subsequent analysis of the data can help to optimise shift patterns and rosters. Telemetry used by motor insurance providers to monitor driving safety is yet another example (though the emphasis here is more on calculating premiums) and no doubt there are also applications within healthcare.

     Logistics

     Wherever GPS signals are part of a business process there is likely to be an application for investigative analytics. For example, one of the major oil companies that has oil rigs in the Arctic uses GPS tracking information combined with weather data to predict the movement of ice floes that can impact on drilling operations. More prosaically, road transportation, field service management and similar sectors are heavily dependent on traffic patterns and the location of relevant vehicles to optimise routing, while container tracking has similar requirements. In these cases there are both real-time issues (recognising that re-routing would be appropriate and doing so) and long-term analytic requirements that will enable better routing in future.

     Comparable requirements can also apply in retail and manufacturing environments where, instead of using GPS signals, goods or parts are identified by RFID tags. One leading aircraft manufacturer, for example, makes extensive use of this technology for part tracking and optimisation.

     Security information and log data

     Unlike the other use cases discussed, the management and analysis of security information (who might be attacking your company’s infrastructure and how) is by no means a new market. The standard approaches to this market include the use of SIEM (security information and event management) and log management, where the former is a superset of the latter that includes (near) real-time identification of attack vectors as well as the storage and forensic analysis capabilities provided against log data. The storage of log data needs to be extremely efficient. Historically, companies used to store only a few months’ worth of data for online analysis. However, the increase in low and slow attacks, or advanced persistent threats (APTs), which can spread over not just months but years, means that very efficient storage mechanisms are required that at the same time support the sort of in-depth analytics needed to identify patterns of activity. Such patterns may be fraudulent or, more often, patterns that will enable the identification of threats.
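
To make the point about low and slow attacks concrete, the following is a minimal sketch, not taken from the paper, of why retention length matters. It assumes a hypothetical log of `(day, source address, event type)` records; the `slow_sources` function, its thresholds and the sample addresses are all illustrative assumptions rather than features of any SIEM product.

```python
from collections import defaultdict
from datetime import date

# Hypothetical log records: (day, source address, event type).
# The addresses are taken from ranges reserved for documentation.
LOG = [
    (date(2012, 1, 5), "203.0.113.9", "failed_login"),
    (date(2012, 4, 17), "203.0.113.9", "failed_login"),
    (date(2012, 9, 2), "203.0.113.9", "failed_login"),
    (date(2013, 2, 11), "203.0.113.9", "failed_login"),
    (date(2013, 3, 1), "198.51.100.4", "failed_login"),
    (date(2013, 3, 2), "198.51.100.4", "failed_login"),
    (date(2013, 3, 3), "198.51.100.4", "failed_login"),
]


def slow_sources(records, min_events, min_span_days):
    """Return {source: span in days} for sources whose failed logins
    recur at least min_events times over at least min_span_days."""
    days_by_source = defaultdict(list)
    for day, source, event in records:
        if event == "failed_login":
            days_by_source[source].append(day)

    flagged = {}
    for source, days in days_by_source.items():
        span = (max(days) - min(days)).days
        if len(days) >= min_events and span >= min_span_days:
            flagged[source] = span
    return flagged


# With the full history, the slow, persistent source is visible.
full_history = slow_sources(LOG, min_events=3, min_span_days=180)

# With only a few months retained, the same pattern cannot be seen.
recent = [r for r in LOG if r[0] >= date(2012, 12, 1)]
short_history = slow_sources(recent, min_events=3, min_span_days=180)

print(full_history)
print(short_history)
```

In this toy data the source that probes once every few months is only flagged when more than a year of records is kept, which is the reason the text gives for needing storage that is both efficient and queryable over long periods.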
  6. In summary

     We do not need to belabour the point: almost anywhere that machines or devices generate information there is scope for investigative analytics because machines always go wrong or are in the wrong place or are doing something interesting right now that you would like to know about. And you would like to know about it not just so that you can take appropriate action at this particular moment but also so that you can analyse and predict when this might happen again so that you can prevent it and, thereby, provide a better service to your customers and/or users.
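
As an illustration of the kind of pattern the preventative maintenance use case describes (if this component fails then that one is likely to do so within a certain period), here is a small sketch under stated assumptions. The event list, the component names and the `follow_on_failures` function are hypothetical; a real analysis would run over far more data and would need a statistical test before treating a pair as meaningful.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical failure events: (timestamp, component name).
EVENTS = [
    (datetime(2013, 1, 1), "pump"),
    (datetime(2013, 1, 2), "valve"),
    (datetime(2013, 2, 10), "pump"),
    (datetime(2013, 2, 12), "valve"),
    (datetime(2013, 3, 5), "fan"),
    (datetime(2013, 3, 20), "pump"),
    (datetime(2013, 4, 25), "valve"),
]


def follow_on_failures(events, window):
    """Count how often a failure of one component is followed by a
    failure of a different component within the given time window."""
    ordered = sorted(events)
    pairs = Counter()
    for i, (t_first, first) in enumerate(ordered):
        for t_next, following in ordered[i + 1:]:
            if t_next - t_first > window:
                break  # events are sorted, so later ones are also too late
            if following != first:
                pairs[(first, following)] += 1
    return pairs


pairs = follow_on_failures(EVENTS, window=timedelta(days=3))
print(pairs)
```

Here a valve failure follows a pump failure within three days on two of the three occasions the pump failed, which is the sort of signal that would prompt an inspection of the valve whenever the pump fails.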
  7. What is required

     Because you are potentially going to be storing and analysing a lot of data you will need a technology that enables you to exploit this data for business insight: to reduce costs, identify new revenue streams, and improve competitive positioning. However, because the sorts of analytics we are discussing include a real-time component, a simple batch-based analytic environment (such as Hadoop) will not be sufficient for the fast, interactive queries needed. There are therefore a number of requirements for such an engine, as follows:

     1. It must be scalable enough to hold all the data you need for long-term analytics. This will obviously be dependent on the environment. For example, to determine preventative maintenance characteristics against the 40,000 sensors on an oil rig will certainly require months’ and quite possibly years’ worth of data, which will be on a different scale from network analysis in a telecommunications company which has a limited number of masts and only needs to perform analyses across limited periods of time.
     2. It must be fast enough to ingest the data within a reasonable timeframe, depending on the latency required. That is, you need to be able to load the data fast enough to provide for whatever real-time query processing or alerting is required.
     3. We are potentially talking about very large quantities of data, notwithstanding the comments made in paragraph 1. In order to be able to store this in an economical fashion you need very efficient compression of the data so that storage requirements can be minimised.
     4. Next, while not necessarily imperative, it is likely that you will want a system that requires no manual tuning or DBA administration such as the creation of indexes. If you have to index the data as it is loaded this will significantly slow down the loading process and it will add to the size of the database, not to mention adding to administrative costs.
     5. The actual time taken to process queries needs to be fast enough to meet service level requirements, especially bearing in mind that this may involve complex analytics. In addition, it is probable that you will wish to run ad hoc queries against the data as well as running standard report and analytic processes, so the database will need to be fast enough and flexible enough to support this in an efficient manner. Databases that use indexes or other constructs, such as projections, to achieve fast query performance will not usually provide good enough performance for unplanned (ad hoc) queries.
     6. The sorts of applications we are discussing are often mission-critical as they support the real-time operations of your organisation. It is therefore necessary that any solution is at least highly available (caters for unplanned downtime without stopping) and, preferably, that it is continuously available (caters for planned downtime as well as unplanned stoppages).

     Of course there are also more generic requirements such as simple and quick implementation, low costs (both direct and indirect), minimal administration and so on.
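
A rough sizing calculation helps show why requirements 1 and 3 go together. The sketch below uses the figure of 32 million metered households from the smart meter use case; the reading interval (every 15 minutes), the record size (100 bytes), the one-year retention period and the 10:1 compression ratio passed in are assumptions chosen only for illustration, and `stored_terabytes` is a hypothetical helper.

```python
def stored_terabytes(devices, readings_per_day, bytes_per_reading,
                     days, compression_ratio):
    """Estimate stored volume in terabytes (10**12 bytes)."""
    raw_bytes = devices * readings_per_day * bytes_per_reading * days
    return raw_bytes / compression_ratio / 1e12


HOUSEHOLDS = 32_000_000   # from the German smart meter example
READINGS_PER_DAY = 96     # assumption: one reading every 15 minutes
BYTES_PER_READING = 100   # assumption: size of one stored record
DAYS = 365                # assumption: one year of retention

raw_tb = stored_terabytes(HOUSEHOLDS, READINGS_PER_DAY,
                          BYTES_PER_READING, DAYS, compression_ratio=1)
compressed_tb = stored_terabytes(HOUSEHOLDS, READINGS_PER_DAY,
                                 BYTES_PER_READING, DAYS, compression_ratio=10)

print(f"uncompressed: {raw_tb:.1f} TB per year")
print(f"at 10:1:      {compressed_tb:.1f} TB per year")
```

Under these assumptions a year of readings comes to roughly 112 TB uncompressed and roughly 11 TB at 10:1. The absolute numbers depend entirely on the assumed interval and record size; the point is that the compression ratio scales the storage requirement directly.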
  8. Infobright

     Infobright is a provider of analytic database technology that comes in three flavours: enterprise-class appliance configurations, software-only installations, and embedded OEM implementations. At its core, Infobright is a columnar database initially built on MySQL. Column-oriented databases are better suited for analytics than row-based databases since, unlike transaction processing environments, it is commonly the case that only a limited subset of columns is required from each record. By grouping the data together in this way, the database only needs to retrieve columns that are relevant to the query, greatly reducing the overall I/O. Being column-based also has the advantage of providing improved compression, which further reduces storage and improves performance.

     However, Infobright goes beyond the conventional use of columns to provide even better performance, better compression, and reduced administration through the use of its Knowledge Grid. This is based on the concept of Data Packs. The data within each column is stored in 64K item groupings called Data Packs. The use of Data Packs improves data compression as the optimal compression algorithm is applied based on the data contents. According to Infobright, an average compression ratio of 10:1 is achieved after loading data into Infobright (though many users see compression of 40:1 and more). At the same time the software creates metadata about the contents of each Data Pack as it is being loaded. This metadata is stored in the Knowledge Grid. The Knowledge Grid contains information about the contents of each Data Pack as well as the relationships between Data Packs, which are automatically created and stored. This includes a set of statistics and aggregate values of the data from each Data Pack, such as MIN, MAX, SUM, AVG, COUNT, and number of NULLs. A further set of metadata describing ranges of numeric value occurrences and character positions, as well as column relationships between Data Packs, is also stored.

     As a query comes in, Infobright uses the information in the Knowledge Grid to determine which Data Packs are relevant to the query before decompressing any data. In many cases, the summary information already contained in the Knowledge Grid is sufficient to resolve the query, and nothing is decompressed. Working together, the Data Packs, Knowledge Grid and Infobright’s iterative computing engine (Granular Computing Engine) should ensure fast, consistent query performance even when data volumes increase dramatically. Needless to say, the Knowledge Grid is automatically updated whenever the database is updated.

     Note that, thanks to the Knowledge Grid, Infobright does not require you to partition or index the data. This not only reduces administration but it also prevents data skew, which is a performance problem for vendors using horizontal (row-based) partitioning and which forces re-balancing of the database.

     In addition, by eliminating the need to partition the data, Infobright delivers support for ad hoc queries, which are a foundational requirement for investigative analytics. The reason for this is that if you partition (or shard) your data, you limit the way that you can access the data: if your query matches the way you have partitioned the data then your queries will perform well, but if it doesn’t then they won’t. In other words, partitioning works best when you know in advance what queries you are going to ask, which is the antithesis of ad hoc and self-service query processes. By not needing to partition data, Infobright ensures a consistent level of query performance regardless of the nature of the query (assuming equal complexity).

     As far as loading is concerned, this can run at up to terabytes per hour within a multi-machine loader configuration with Infobright. Many customers use Infobright in a highly dynamic production environment where new data needs to be loaded and accessed within minutes for near-real-time analytics.
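
The general idea of consulting per-pack statistics before touching the data can be sketched in a few lines. The code below is an illustration of min/max pruning in general, not Infobright’s implementation: the `PackStats` class, the `build_packs` and `count_greater_than` functions and the tiny pack size are all assumptions made for readability, and compression is not modelled (a pack that has to be “opened” stands in for one that would have to be decompressed).

```python
from dataclasses import dataclass

PACK_SIZE = 4  # the paper describes 64K-item packs; kept tiny here


@dataclass
class PackStats:
    """Summary metadata kept for one pack of column values."""
    minimum: int
    maximum: int
    total: int
    count: int


def build_packs(values, pack_size=PACK_SIZE):
    """Split a column into packs and record statistics for each pack."""
    packs = [values[i:i + pack_size] for i in range(0, len(values), pack_size)]
    stats = [PackStats(min(p), max(p), sum(p), len(p)) for p in packs]
    return packs, stats


def count_greater_than(packs, stats, threshold):
    """Answer COUNT(*) WHERE value > threshold, reading a pack's
    contents only when its statistics cannot settle the question."""
    matched = 0
    opened = 0
    for pack, s in zip(packs, stats):
        if s.maximum <= threshold:
            continue              # no value in this pack can match
        if s.minimum > threshold:
            matched += s.count    # every value matches: metadata suffices
            continue
        opened += 1               # mixed pack: the values must be read
        matched += sum(1 for v in pack if v > threshold)
    return matched, opened


readings = [1, 2, 3, 4, 10, 12, 11, 13, 3, 9, 4, 8, 20, 21, 22, 23]
packs, stats = build_packs(readings)
matched, opened = count_greater_than(packs, stats, threshold=8)
print(matched, opened)
```

In this example only one of the four packs has to be read to answer the query; the other three are resolved from their minimum, maximum and count alone. Because the statistics are built automatically at load time for every column, no index or partitioning scheme has to be chosen in advance, which is the property the text relies on for ad hoc queries.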
  9. Conclusion

     As Kevin Ashton wrote, back in the last century, “The Internet of Things has the potential to change the world, just as the Internet did. Maybe even more so.” It has taken more than a decade but the Internet of Things is here. It isn’t yet as widely implemented as it will be and it will take a while before its full impact is felt, both at a business level and in our daily lives, especially as it is exploited through the use of investigative analytics. But make no mistake: it is here and it is growing.

     From a business perspective this has very significant repercussions: with the addition of investigative analytics the Internet of Things will enable substantial steps forward in customer service in the present, and in business planning for the future. Like all major technology changes this combination of capabilities offers both opportunities and threats and there will be winners and losers. The winners will be those that grasp these new technologies and use them to enhance and expand their business.

     Further Information

     Further information about this subject is available from http://www.BloorResearch.com/update/2170
  10. Bloor Research overview

     Bloor Research is one of Europe’s leading IT research, analysis and consultancy organisations. We explain how to bring greater Agility to corporate IT systems through the effective governance, management and leverage of Information. We have built a reputation for ‘telling the right story’ with independent, intelligent, well-articulated communications content and publications on all aspects of the ICT industry. We believe the objective of telling the right story is to:

     • Describe the technology in context to its business value and the other systems and processes it interacts with.
     • Understand how new and innovative technologies fit in with existing ICT investments.
     • Look at the whole market and explain all the solutions available and how they can be more effectively evaluated.
     • Filter “noise” and make it easier to find the additional information or news that supports both investment and implementation.
     • Ensure all our content is available through the most appropriate channel.

     Founded in 1989, we have spent over two decades distributing research and analysis to IT user and vendor organisations throughout the world via online subscriptions, tailored research services, events and consultancy projects. We are committed to turning our knowledge into business value for you.

     About the author

     Philip Howard
     Research Director - Data Management

     Philip started in the computer industry way back in 1973 and has variously worked as a systems analyst, programmer and salesperson, as well as in marketing and product management, for a variety of companies including GEC Marconi, GPT, Philips Data Systems, Raytheon and NCR.

     After a quarter of a century of not being his own boss Philip set up his own company in 1992 and his first client was Bloor Research (then ButlerBloor), with Philip working for the company as an associate analyst. His relationship with Bloor Research has continued since that time and he is now Research Director focused on Data Management.

     Data management refers to the management, movement, governance and storage of data and involves diverse technologies that include (but are not limited to) databases and data warehousing, data integration (including ETL, data migration and data federation), data quality, master data management, metadata management and log and event management. Philip also tracks spreadsheet management and complex event processing.

     In addition to the numerous reports Philip has written on behalf of Bloor Research, Philip also contributes regularly to IT-Director.com and IT-Analysis.com and was previously editor of both “Application Development News” and “Operating System News” on behalf of Cambridge Market Intelligence (CMI). He has also contributed to various magazines and written a number of reports published by companies such as CMI and The Financial Times. Philip speaks regularly at conferences and other events throughout Europe and North America.

     Away from work, Philip’s primary leisure activities are canal boats, skiing, playing Bridge (at which he is a Life Master), dining out and walking Benji the dog.
  11. Copyright & disclaimer

     This document is copyright © 2013 Bloor Research. No part of this publication may be reproduced by any method whatsoever without the prior consent of Bloor Research.

     Due to the nature of this material, numerous hardware and software products have been mentioned by name. In the majority, if not all, of the cases, these product names are claimed as trademarks by the companies that manufacture the products. It is not Bloor Research’s intent to claim these names or trademarks as our own. Likewise, company logos, graphics or screen shots have been reproduced with the consent of the owner and are subject to that owner’s copyright.

     Whilst every care has been taken in the preparation of this document to ensure that the information is correct, the publishers cannot accept responsibility for any errors or omissions.
  12. 2nd Floor, 145–157 St John Street
     LONDON, EC1V 4PY, United Kingdom
     Tel: +44 (0)207 043 9750
     Fax: +44 (0)207 043 9748
     Web: www.BloorResearch.com
     email: info@BloorResearch.com
