Big data is a popular term used to describe the exponential growth, availability and use of information, both structured and unstructured. Various detailed definitions exist to clarify this description; you’ve probably heard them all today.
But just in case you haven’t, big data tends to have 3, 4 or 5 characteristics:
• Volume. For example, 340m tweets are sent per day, nearly 4,000 tweets per second; more than 5bn people are calling, texting, tweeting and browsing websites on mobile phones; 10,000 payment card transactions are made every second around the world; and Facebook has more than 901 million active users. I could go on. Yep, we’re drowning in data.
• Variety, the spice of life. With the explosion of smart devices, the data an organisation has is now more varied and includes semi-structured and unstructured data from web pages, email and so on.
• Velocity: the speed with which the data is being generated and needs to be processed, specifically what IBM defines as data “in motion” rather than “at rest”.
And if you’re SAS there are 2 more characteristics:
• Variability: looking at the peaks and troughs of data. Is there a daily, weekly or seasonal trend going on?
• Complexity: not just lots of it from lots of places, but also from multiple differing sources.
According to the experts, small data is gone. Data is going to get bigger and bigger and bigger, and people will just have to think differently about how they manage it.
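As a quick sanity check on the volume figures quoted above, the per-day and per-second numbers can be converted into each other. This is only a sketch: the helper names are made up for illustration, and the inputs are the round figures from the talk rather than measured data.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a day


def per_second(events_per_day):
    """Convert a daily count into an average per-second rate."""
    return events_per_day / SECONDS_PER_DAY


def per_day(events_per_second):
    """Convert a per-second rate into a daily count."""
    return events_per_second * SECONDS_PER_DAY


# Figures quoted in the talk (rounded, illustrative).
tweets_per_second = per_second(340_000_000)
card_payments_per_day = per_day(10_000)

print(f"{tweets_per_second:,.0f} tweets per second")
print(f"{card_payments_per_day:,.0f} card transactions per day")
```

So 340m tweets a day works out at roughly 3,935 a second, which matches the "nearly 4,000" quoted, and 10,000 card transactions a second is 864 million a day.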
So in theory now you can:
• Recalculate entire risk portfolios in minutes and understand future possibilities to mitigate risk
• Mine customer data for new insights that drive new strategies for customer acquisition, retention and next best offers
• Quickly identify the customers who matter the most
• Send tailored recommendations to mobile devices at just the right time, whilst target customers are in the right location
• Analyse data from social media to detect new market trends
Again, I could go on. To misquote Michael Fish: isn’t data brilliant?
But with the growth of so much data comes a whole new set of challenges, and many businesses are concerned that the amount of amassed data is becoming so large that it is difficult to find the most valuable pieces of information.
• What if your data volume gets so large and varied you don’t know how to deal with it?
• Do you store all your data?
• Do you analyse it all?
• How can you find out which bits are the most important?
• How can you use this knowledge to your best advantage?
Interestingly, some of these challenges were put by the Economist Intelligence Unit to a number of senior executives during 2011 and then again in the early part of 2012.
What was interesting about the 2011 survey was that 50% of respondents said that data had become an important factor for their business, with nearly 10% saying that data had completely changed the way their company worked. That’s a good thing, but the report also found that many companies struggle with the basic aspects of data management and with their attempts to exploit their data effectively. Hold that thought.
In 2012, sponsored by SAS, the Economist Intelligence Unit conducted another survey and a few more interesting facts emerged.
Firstly, many companies are aware of the power of big data but are not exploiting the data they collect. Having said that, the top-performing companies financially were placing a higher premium on their data than their peers.
Secondly, and I think probably even more important: tempting as it is to think that technology can transform a business, companies first need to recognise the problem they want to resolve. In other words, big data can only work its magic if a business puts a well-defined data strategy in place BEFORE it starts collecting and processing information. That strategy should be based on key business priorities, with the data component developed afterwards. In the Economist’s survey, 46% of executives from companies that significantly outperform their peers financially said they had a well-defined data strategy, more than four times the proportion for those on a par with their peers.
The third point from the Economist’s survey, and this is really critical: talent matters as much as the technology. It can’t be confined to the IT department.
Analytical thinking has to be spread across the organisation, with everyone thinking about how data can improve performance and, with the help of data experts, transforming those thoughts into actions. To do this requires more than a knowledge of computer programming and statistics. Data professionals are now required to understand a company’s business priorities and competitive environment so they can exploit the data and ask the right questions. Finding such people is not easy. In the Economist’s survey, 41% said a lack of skilled staff hampers their attempts to process data more efficiently and speedily.
Fourthly, data from web tracking technologies and social media networks is becoming really important. Many companies are already collecting web data, and it will be the ability to use this data well that is likely to make the most difference.
A good example of this comes from EMI, who spotted something interesting in 2011: new artists had a strong following among young people but little recognition in other demographic groups. How to market certain artists was something that was mostly confined to managers who knew a lot about the industry but not about data. Now EMI uses a dataset that contains over 1 million customer interviews across 25 countries, with each one generating over 100 pieces of data. The results are then matched to a stream of data from Spotify, which has 15m tracks and 3m paying users. Anonymised data on every track a user listens to, combined with the EMI interviews, enables EMI to track an artist’s popularity amongst different demographic groups, so they can then target their marketing spend accordingly.
The Association of British Insurers recently revealed that fraudulent claims on pet insurance rose to £2m in 2010 from £460k in 2009, in some cases involving pets that didn’t exist. Social network data analysis may reveal who is or is not a pet owner during claims handling, by checking for blog references to an animal. For many, the challenge is about looking at data differently and exploiting different types of data.
Some companies are considering how they can use the big data they are collecting for secondary revenue purposes. For example, a taxi firm or delivery company might install GPS in its vehicles so it can track its operations. As a by-product, the company now has reams of data on vehicle movements, accidents and road conditions. This data might be useful for insurance companies to assess the risk of different roads. Sat nav companies can use tracking data to determine the quickest route between two locations at different times of the day. Perhaps the data could be used by local government for planning purposes.
So my conclusion from all this is that all this data can be truly transformational to a company, but you need to know what you want it for and you have to have people who can truly help you get at it.
Put another way, you sort of have to know what you’re capable of doing now with your data, regardless of whether it’s big or small data, and what you’re not, and then work out where you want to be. So how can you find out how data capable your business is and whether Big Data is something your organisation is ready for?
Well, we’ve designed a model that helps companies to establish their data maturity and thus how capable they are of really benefiting from the information that can be gleaned from Big Data. If you’re not yet maximising the data you have, then maybe delving into Big Data is not for you. This model tracks a number of points on the Data Maturity Curve so that, going back to the Economist’s survey, an executive team can determine whether they’re ready to embrace Big Data, or indeed small data, with any degree of capability.
We do this by gathering unstructured and structured data, starting with a series of face-to-face interviews with key staff to understand how data is used within the organisation: who “gets it”, who doesn’t, where the data comes from, who uses the data intelligently, whether the business is driven by data, and so on. This process is a little like the EMI example I gave earlier on, if you will, except typically we interview around 30-50 people in the organisation rather than 1m.
Pretty quickly from this concentrated set of interviews you get a very good view of the data capability within the organisation. Rarely do we find companies at 4 or 5 on the model. What tends to emerge is a picture of frustration, where there are silos of good data but no Single Customer View, or web data and social media are being tracked and there are groups trying to make sense of this, but it isn’t being integrated in any way with existing data in the traditional data warehouse.
We then send out a structured questionnaire which covers in more detail the specifics of how data is used within the business. Often this is tailored to the various functions within the company, so that the marketing team will get one version to help us understand how data is used to provide customer and prospect analysis, while other, more technical questions are put to the IT team to understand how advanced the company is in terms of data models, dictionaries, standards and so on. We also ask the executive team specific questions about how data is used to run the business and whether it’s more important than the brand. Many, harking back to what the Economist found, believe it is more important than their brand but also realise that the business is struggling to make best use of it. In all there are over 200 questions that we can use to build up a clear picture of an organisation’s data capability.
What is useful about this approach to many of the executive teams we engage with is that the output from the initial discovery phase covers not just the IT team and technology but also people, policy, process, compliance and measurement. As we heard earlier on, people are really key to the process: without people in the organisation who have the skills, understand the business priorities and can use the data to address them, Big Data is pretty pointless.
So we provide an overall picture as well as a very specific view for each attribute.
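To make the idea of an overall picture plus a per-attribute view concrete, here is a minimal sketch of how questionnaire answers could be rolled up. It is an illustration only: the attribute names come from the talk, but the 1 to 5 answer scale, the sample answers, the function names and the simple unweighted averaging are all assumptions, not the actual scoring used in the model.

```python
from collections import defaultdict
from statistics import mean

# Attribute names are taken from the talk; everything else is illustrative.
ATTRIBUTES = ["technology", "people", "policy", "process", "compliance", "measurement"]


def score_by_attribute(answers):
    """Average the answers (assumed 1-5 scale) for each attribute.

    `answers` is a list of (attribute, score) pairs, one per question.
    """
    grouped = defaultdict(list)
    for attribute, score in answers:
        if attribute not in ATTRIBUTES:
            raise ValueError(f"unknown attribute: {attribute}")
        if not 1 <= score <= 5:
            raise ValueError(f"score out of range: {score}")
        grouped[attribute].append(score)
    return {attribute: mean(scores) for attribute, scores in grouped.items()}


def overall_score(attribute_scores):
    """Overall picture: unweighted mean of the attribute averages."""
    return mean(attribute_scores.values())


# Hypothetical answers to a handful of questions.
sample_answers = [
    ("technology", 4), ("technology", 3),
    ("people", 2), ("people", 2),
    ("policy", 3),
    ("process", 2), ("process", 3),
    ("compliance", 4),
    ("measurement", 1), ("measurement", 2),
]

attribute_scores = score_by_attribute(sample_answers)
overall = overall_score(attribute_scores)
weakest = min(attribute_scores, key=attribute_scores.get)

for attribute, score in attribute_scores.items():
    print(f"{attribute:<12} {score:.1f}")
print(f"{'overall':<12} {overall:.1f}")
print(f"weakest attribute: {weakest}")
```

With these made-up answers the technology score looks healthy while measurement and people lag behind, which is the kind of contrast the per-attribute view is meant to surface.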
Looking specifically at the technology for a moment, the model might show really good scores for platform standardisation and validation, but perhaps the extent to which data analysis technologies are appropriately deployed is lacking because the data is too unstructured. For more advanced organisations there is scope to run the model and examine scores relating to Big Data as well as traditional, more structured data. Big Data still needs good metadata.
Think about mapping all the various touch points for the data you’re trying to get to, so you’re clear on the strategy for mining this data. Identify your “sacred data”, define the workflows that will use unstructured data, and use data to redefine business processes following a highly iterative process.
The challenges of big data, how data capable is your business? DQM Group
Christine Andrews – DQM Group
What’s all the fuss about?
The challenges of Big Data
Who’s doing Big Data well
How can you tell how well you’re doing
How to make your organisation more “data capable”
Beware the EU
IBM and SAS Definitions
Variability and Complexity
What can you do that you couldn’t do
Recalculate entire risk portfolios
Mine customer data for new insights
Quickly (and I mean quickly) identify the customers who matter most
Send tailored communication to mobile devices at exactly the right time
Analyse data from new social media
SAS and Economist Survey
Aware of power of data but not exploiting data collected
Need for defined data strategy
Talent matters as much as technology
Growth in web tracking and social media data
The Data Maturity Curve
[Figure: data maturity curve with a short description of each maturity level. Based on the MIKE2.0 Methodology.]
Technology – Example
Extent to which data analysis technologies are appropriately deployed
Data Quality Metrics
Considers the inclusion of automated data quality metrics
Common Data Services
Master Data Management
Dashboard (Tracking / Trending)
Data Quality Strategy
Data Quality Metrics
Map your data flows and touch points
Data Subject Area Coverage
Service Level Agreements
Data Quality Metrics
Big Data offers significant opportunities but
Think carefully about your current data capabilities before embarking on new ones
Consider the people you need and the processes you follow as well as the IT
Don’t forget compliance and any future EU
Thanks for Listening