Practical Analytics
White Paper
By: John Enoch (john_enoch@ymail.com)
Date: 1 February 2013
EXECUTIVE SUMMARY
§ Keep Big Data in view, but some healthy scepticism on the expectations set by IT vendors is advisable.
§ First ask what data you have and what you can usefully do with it.
§ Getting value from data is as much about people and skill sets as it is about technology.
§ Poor planning leads to complex tools chasing answers to the wrong questions.
§ Surely there are more useful applications for Big Data than mining social media sites and targeting on-line advertising?
§ Big Data has a lot to offer in risk management and strategic planning.
§ A lot of useful data lies in silos and is unstructured and unused. You need to know what you want and where to find it before deciding what to do with it.
§ As with any major project, a large scale analytics initiative needs executive backing and cross-functional consensus on method, ownership and outcomes.
§ Given the complexity and potential pitfalls of Big Data initiatives, perhaps it would be wisest to start small with more manageable data sets.
§ Why not start with some quick wins on something undeniably valuable, like cost savings, which will not need highly complex tools and project management resource?
§ High impact quick wins: travel costs and mobile phone expenses.
Introduction
Much has been written about Big Data and the potential changes technology will make to how strategy is formulated and business decisions are made. However, not all useful data is big. Across many businesses, there is a pressing need to utilise data assets whilst they are still fresh and useful.

Long established organisations in telecommunications, financial services and utilities have been handling large data sets that amount to Big Data for decades, although they processed them somewhat more slowly than is possible today. Newer businesses, such as social media and life sciences, are applying new analytics capabilities to vast and ever increasing volumes of data. Also, in our increasingly monitored society, evidentiary technologies such as video capture and call recording generate large volumes of data and call for increasingly sophisticated analytical tools.

Data analytics has become a core competence for businesses. Insights from data are now a key source of competitive advantage. Big Data comes into play when the data sets get too big for standard data mining tools to handle and where there is a need for a more real time and predictive approach.

However, for most businesses, old fashioned tools such as relational databases are still widely in use and there is a familiarity with data analytics within most organisational skill sets. The need for large scale data analytics is not yet widely seen as a priority issue. Big Data is still usually seen as an emerging technology with a set of benefits to keep an eye on, more with a view to how it could be put to practical use in future.

Whilst this development and learning around the use and capabilities of Big Data continues, it would be useful for businesses to apply their existing analytics skills to shorter term, practical issues that use smaller data sets. Businesses continually generate data and much of it gets lost or orphaned in silos when it could be usefully consolidated, structured and processed.

There will always be a need for human intelligence, intuition and know-how in the use of data analytics. Decisions on strategy, cost reduction, supplier management, fraud, regulatory compliance and tax management all need a combination of human intuition and insights drawn from the hard facts of data analysis to know what to look for, where to find it and how to assess false positives.

For these well defined issues, employees usually know how things could be improved or money saved. Since improvements in such areas generate a high impact in a short time with the least complexity, this paper advises businesses to consider focusing efforts on the smaller things that have a more assured return on investment before taking on complex Big Data initiatives.
About Big Data
i. Data Volumes, People and Technology
According to the recent IDC Digital Universe paper, the world's data is doubling every two years, with 1.8 trillion gigabytes created in 2011. The volume, speed and variety of data are changing rapidly, and with them the tools with which data can be processed.

Just like a newspaper, data has a shelf life and its value goes down over time. It is an asset that is created by people or machines taking actions and leaving traces of what they did. It grows exponentially and across the organisation, which is why it is so difficult to keep up with the information assets of today, let alone be ready to maximise value from the data sources of tomorrow.

Making a data analytics initiative successful is as much about having the right people as it is about having the right data and technologies. The challenge is to use huge data resources quickly enough to catch the value in the moment. A lot of focus and clarity on what the initiative is supposed to achieve is needed before any implementation project takes place. Otherwise the outcome of a project can end up as a system with inappropriate rules, chasing answers to the wrong sort of questions. Sadly this seems to happen quite often.

To get things right there is a need for the right people at all stages of the project. If the skills to generate insight from data and act on it are not yet strong enough in-house, it would be prudent to look externally for suitable consultants who can bring a perspective based on wider experience.

Data sets over 10TB are typically classified as “Big Data”. In the past, this has meant that they are so large that it has been difficult to process them with standard tools and in-house resource. For example, a global telecommunications company may collect billions of detailed call records per day from 120 different systems and store each for at least nine months.

It used to be slow to run billing cycles and trend analysis to get insights from large data sets. Not any more. Processing bulk data has been made possible by technologies such as Apache's Hadoop and the many proprietary alternative products such as EMC Greenplum, MapReduce, HP Vertica, IBM Smart Analytics/Netezza, Sybase IQ and Teradata. Also, storage has become a lot cheaper. These changes have finally made large scale data mining practical and affordable.

It seems that these exciting new technologies are mainly being used for social media sites, shopping sites and targeted advertising to sell more “stuff”. Surely powerful and game changing data analytic tools can be applied more usefully to improve the way businesses are run and have a positive impact on whole economies?
ii. Governance
One very useful application for Big Data is governance, especially in industries like Financial Services where the problems are systemic and seemingly embedded in the culture of doing business. After the wave of Sarbanes-Oxley work in the early 2000s, many Internal Audit departments have been contracting. There is an opportunity for them to take on a new and more strategic role by using data analytics to improve agility around managing the enterprise risk profile.

Engaging independent consultants can be very useful in getting past internal barriers and achieving the identification, collation and quality checking of key data sets. Specialist assistance can also be useful in helping to define the rules and algorithms that need to be applied in order to get the desired outcomes. Then there is a need for consensus on what truths the data is indicating and how processes and policies need to be changed to better address the evolving risk profile. Coordinating these sorts of projects with both in-house and external expertise leverages the existing skills of Internal Audit in new ways.

Data analytic systems can generate outputs that will hopefully lead to more informed decision making. In order for enterprise risk management to be effective, the outputs need to be made readily available to the various risk committees, who can then evaluate risks, impact and likelihoods in a more fact based, less subjective way.

A common problem is where companies try to apply a Business Process Re-Engineering (BPR) or “lean” approach to managing risk. Whilst the BPR/lean approach works well for automated systems such as factory floors, it does tend to eliminate creativity and flexibility. The input for decision points is not just the data itself, but also how people assess data, trends and gut feelings and come to a consensus before decisions are made.

When applying analytics to governance, a more knowledge-based process is needed. The value of all these new insights from data comes together at the decision points where one or more people can ask the simple question “why?” BPR/lean was really designed for managing a flow of actions and decisions made by machines, not by people.

The risk function can become a source of business insight and consensus building when it has clarity on what is really going on. Big Data enables this, and hopefully organisations with known and serious governance problems will give Big Data initiatives a high priority.
Problems with Big Data
i. Silos of Data
In a recent analysis from IBM and Oxford University, it was found that out of over 1,000 businesses surveyed, fewer than 30% had even started to adopt Big Data. Of those that had, fewer than 6% had got beyond the pilot stage to see real life Big Data initiatives come into use (see “IBM 2012 Analytics Study: The real-world use of big data”).

Changing an organisation so that its data assets are kept fresh, consolidated and available for suitably skilled professionals to use is in itself a major project. One of the most time consuming parts is actually finding the data and extracting it from different systems, both inside and outside firewalls. There are also issues of security, privacy and data protection that need to be addressed.

Getting data out of silos and into a structure and format that makes it usable is also a challenge. Automated tools are now being used to create master data repositories across disparate enterprises and silos, to derive data quickly and to combine various data sources so that they use consistent classes and properties. The challenge is to get quality data that takes a business away from silos and into a single version of the truth. This, in turn, can lead to higher data quality and faster deployments of synchronised Business Intelligence scorecards and dashboards across organisations.

ii. Data Structure
Another issue is that data usually comes in multiple types and formats. To make it even more difficult, much of the explosion in data volumes can be attributed to large amounts of "unstructured" data. This does not fit into a traditional data model (i.e. database, data warehouse, etc.) but is represented by text, server logs and web based data.

To translate these into an analytics framework, the trend is to develop hybrid data structures that leverage both traditional Business Intelligence tools for structured data (e.g. relational databases) and new technologies that allow very large quantities of unstructured data to be streamed in ways that enable businesses to mine the information for analytical purposes. Businesses often seek to convert data into a more structured format to augment existing Business Intelligence and analytic methods. Much of the hype around "Big Data" refers to this approach and, as with all hype, there will be both an element of truth and possibly a lot of unfulfilled expectations.
iii. Quality in Data and Results
Data quality is a significant issue. Finding high-quality data can become expensive, time-consuming and frustrating. There is never going to be a perfect data set, so at some point someone needs to decide whether or not a data set is good enough to use. There will always be a need for validation checks to identify duplicates, outliers or orphan records. Practicality comes with the experience and know-how that says when to draw the line and accept or reject a data set.

When data analysis is completed, the main issue is false positives, which can lead to wrong conclusions and bad decisions. Knowing whether or not a set of data is making sense is a people skill and requires knowledge of both the data and the context in which it is being used.

iv. People & Terminology
Since Big Data initiatives are so complex, they need a lot of consensus and commitment from management in order to be successful. As with all initiatives, people are going to be the ultimate drivers of success or failure.

A lack of proper executive sponsorship is probably the leading problem for developing and sustaining practical use of analytics in any business. It can be useful to get external advisors to facilitate workshops in order to establish priorities and write up action plans, which removes the potential bias of politics and dominant personalities within organisations.

A lack of consensus around specifications, objectives and definitions at the working level is probably the biggest cause of project failure. When definitions of requirements are not aligned with business strategy or a clear list of required outcomes, the end result will ultimately not be right for the business.

The skill sets and expertise needed for data analytics are not the same as traditional IT skill sets. This can lead to internal conflicts and politics between the people who define the specification and algorithms and those who need to assemble the systems to make things work. As with any conflict, there is value in having a mediator: a neutral third party who can facilitate consensus around key issues to prevent people problems from derailing projects.

This sort of external consulting and facilitation can lead to the creation of analytics competency centres, closely involving Internal Audit departments, in a structure that sits between IT and the business users. Analytics is too embedded in business context and process to be managed and controlled as restrictively as most traditional IT applications. Organisations that can find a balance of IT control (standard architecture) and business definition and use (governance) would probably benefit the most from Big Data analytics.
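The validation checks mentioned under Quality in Data and Results (duplicates, outliers and orphan records) can be illustrated with a short sketch. This is only an illustration under stated assumptions: the record layout, the field names (invoice_id, supplier_id, amount) and the outlier rule (a modified z-score with a cut-off of 3.5) are hypothetical choices and do not come from this paper.

```python
from collections import Counter
from statistics import median


def find_duplicates(records, key):
    """Return the key values that appear on more than one record."""
    counts = Counter(r[key] for r in records)
    return sorted(k for k, n in counts.items() if n > 1)


def find_orphans(records, foreign_key, parent_ids):
    """Return records whose foreign key matches no known parent record."""
    parents = set(parent_ids)
    return [r for r in records if r[foreign_key] not in parents]


def find_outliers(records, field, threshold=3.5):
    """Flag records using a modified z-score based on the median absolute deviation."""
    values = [r[field] for r in records]
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread around the median, so nothing can be ranked
    return [r for r in records if 0.6745 * abs(r[field] - med) / mad > threshold]


# Hypothetical sample data: supplier invoices and a supplier master list.
suppliers = ["S1", "S2"]
invoices = [
    {"invoice_id": "A1", "supplier_id": "S1", "amount": 100.0},
    {"invoice_id": "A2", "supplier_id": "S1", "amount": 110.0},
    {"invoice_id": "A2", "supplier_id": "S2", "amount": 95.0},    # duplicate id
    {"invoice_id": "A3", "supplier_id": "S9", "amount": 105.0},   # unknown supplier
    {"invoice_id": "A4", "supplier_id": "S2", "amount": 5000.0},  # unusually large
]

duplicate_ids = find_duplicates(invoices, "invoice_id")
orphan_ids = [r["invoice_id"] for r in find_orphans(invoices, "supplier_id", suppliers)]
outlier_ids = [r["invoice_id"] for r in find_outliers(invoices, "amount")]
```

The checks only flag candidates. Deciding whether a flagged record is a genuine error or a false positive remains the people skill described above.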
v. Underlying Assumptions
Another issue with Big Data is that it is too often presented by big IT vendors as being the answer to everything. Big Data is not the beginning and the end of all forms of data analysis. It is a far higher level of capability and complexity over and above more established data mining tools.

As with most data mining, there is the underlying assumption that historical data can predict future results to a high level of accuracy. Sometimes, if not often, this is true. However, exceptions and coincidences should be expected. If there is a change to the environment that generated past results, then that data set and its algorithms are unlikely to be valid for further predictive analytic tasks.

More, and more complex, algorithms could be put in place to anticipate change, but their success will ultimately come down to the assumptions upon which they are based. There is always the risk that an analytical system will no longer represent its data set.

Big Data is without doubt a phenomenon. Equally certain is that we are in the very early stages of its evolution. So rather than getting too excited by the promise of Big Data as evangelised by big vendors, it is probably best to get clarity and consensus on what each Big Data initiative is supposed to achieve and at what point it would justify the time, resource and expense required. Then plan the project, investigate the technology and define any need for highly skilled data scientists.

The early stage of any large scale data analysis project is best spent focusing on the people who will have to use the outputs of Big Data and the people putting in place the infrastructure to enable it. For this reason it is advisable to bring in a neutral third party to facilitate consensus and the action plans.

Back to Business Intelligence
i. Smaller Scale Data Analytics Projects
Whilst Big Data generates a lot of media attention, organisations that continue to mine various data sets often do so with clear goals and a successful outcome. These smaller projects are usually referred to as “Business Intelligence” or “Business Analytics”, with the two terms being used interchangeably. For the sake of simplicity, we refer to Business Analytics as having bigger data sets and a lot more complexity in its algorithms. What we here refer to as Business Intelligence (BI) uses less powerful tools, usually with in-house skill sets and far more familiar data mining techniques.

The BI approach is very powerful when applied to smaller, targeted issues where insights from data analysis can lead to recommendations and quickly yield a positive return. Processes are similar to habits: they evolve over time and can be hard to change. However, the fact that something has always been done in a certain way does not automatically mean it ever made sense.
Employees are usually aware of systems and processes that waste their time, cost the business and make it difficult to get their jobs done. They can also come up with good ideas for improving things given access to the right data analysis, which BI is well suited to deliver.

Supplier management is an example of a high yield place to start when asking what common tasks and activities waste time, cost money and could be improved through better insights from data. The data sets used in such work are typically quite easy to find: supplier contracts, billing data and inventories. Linking these together in relational databases makes it possible to query, analyse trends, find anomalies and identify opportunities to save money or claim rebates. This sort of project is typically small but can have a big impact.

ii. Data Analytics Applied to Travel Costs
To illustrate the value of old fashioned BI projects to businesses, even with all this focus on Big Data, look at travel costs. Travel costs are a good example of where data analytics can answer questions that lead to decisions that in turn save money.

Most large companies will likely have a centralised booking system for flights and hotels provided by one or more travel agents. These travel suppliers have probably won their contracts based on volume related discounting and contractually defined service levels.

What questions can be asked of all the historical booking data to find potential cost savings? The first questions that come to mind would be:
1. Are all travel agents charging correctly based on volumes or are rebates due?
2. Is one travel agent being underused, thereby missing the opportunity to qualify for better discounted pricing?
3. Are the selections of hotels and flights being made on the right criteria or are additional inputs required, such as user ratings, location etc.?

Once the contract terms are coded as rules through which all the booking data can be viewed, it becomes relatively straightforward to create a continuous audit. Perhaps other data sets need to be added to the system over time, such as user reviews, safety records etc. Once travel has been addressed and the opportunity for cost savings proven, other cost areas also present themselves for the same treatment.
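As an illustration of coding contract terms as rules, the sketch below checks the first two questions against booking data. It rests on assumptions that are not from this paper: a hypothetical contract with three volume tiers, discounts applied to the list price, and invented field names (agent, list_price, amount_charged). A real contract would need its own rules.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical contract terms: (minimum annual volume, discount rate), in ascending order.
DISCOUNT_TIERS = [
    (Decimal("0"), Decimal("0.00")),
    (Decimal("100000"), Decimal("0.05")),
    (Decimal("250000"), Decimal("0.08")),
]


def contract_discount(total_volume):
    """Return the discount rate of the highest tier reached by the booked volume."""
    rate = Decimal("0")
    for floor, tier_rate in DISCOUNT_TIERS:
        if total_volume >= floor:
            rate = tier_rate
    return rate


def rebates_by_agent(bookings):
    """Compare what each agent charged with what the contract rules say was due."""
    list_totals = defaultdict(Decimal)
    charged_totals = defaultdict(Decimal)
    for b in bookings:
        list_totals[b["agent"]] += b["list_price"]
        charged_totals[b["agent"]] += b["amount_charged"]

    rebates = {}
    for agent, volume in list_totals.items():
        expected = volume * (1 - contract_discount(volume))
        # A positive difference is an overcharge that could be claimed back.
        rebates[agent] = max(charged_totals[agent] - expected, Decimal("0"))
    return rebates


# Hypothetical booking history for two travel agents.
bookings = [
    {"agent": "Agent A", "list_price": Decimal("50000"), "amount_charged": Decimal("49000")},
    {"agent": "Agent A", "list_price": Decimal("40000"), "amount_charged": Decimal("39500")},
    {"agent": "Agent A", "list_price": Decimal("30000"), "amount_charged": Decimal("29500")},
    {"agent": "Agent B", "list_price": Decimal("20000"), "amount_charged": Decimal("20000")},
]

rebates = rebates_by_agent(bookings)
```

In this invented data, Agent A has reached the 5% tier but charged more than the discounted total, so a rebate appears to be due. Agent B is well below the first discount threshold, which is the kind of underuse the second question is looking for. Running the same rules over each new batch of bookings is one simple form of continuous audit.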
iii. Analytics Applied to Mobile Phone Costs
Mobile phone costs are likely to be a major source of potential cost savings. The supplier contracts are complex, there is a lack of regulatory oversight of roaming, billing has errors and data hungry applications are becoming more and more common. In addition, employees can see their mobile phone as a perk and there are often lax attitudes toward what would be reasonable and responsible usage. Also, the technology is fast changing and the IT industry keeps coming up with new products and solutions, only some of which live up to the hype.

There has been a lot of industry hype recently around Bring Your Own Device (BYOD). This is rarely defined: it could apply to employees buying their own mobile devices and paying their own costs, or it could be about the security and partitioning of mobile devices. However, for most businesses, the possible productivity benefits of BYOD apply usefully to laptops and tablets rather than mobiles. For mobile management, BYOD has again been presented by the IT industry as an answer to everything and has taken too much attention away from real solutions to tangible problems.

Both the payments to suppliers and the expense management of employees present opportunities to save time and money. As well as managing mobile supplier costs, there is also the issue of how employees claim their expenses for mobiles and separate their personal usage from company costs. What many companies have been doing for years is getting employees to go through their mobile phone bills with a yellow marker pen to manually separate the business calls from the personal ones. This is time consuming and not necessarily accurate. However, it could be seen as a simple BYOD policy.

The core data sets are the mobile carrier contracts, the mobile carrier billing data and device inventories. There are probably additional data sets that could be included, such as data and system logs from the mobile devices, applications and any mobile device management systems (e.g. BlackBerry Enterprise Server). More and more data sets can be included as more questions need answering.

Perhaps the most obvious questions that a business would want its mobile data to answer are:
1. How can technology simplify and streamline the mobile expense management process and claim back the hours employees spend with yellow marker pens, bills and forms?
2. Are contract rates actually being applied or are rebates due?
3. Are employees using their mobiles in a fair and cost efficient way?

As with the example of travel costs above, the technology tool would be a BI database structure into which the different data sets are absorbed and for which a graphical user interface is developed. This GUI would make it easy to spot trends, flag risks and extract data for external analysis and reporting.
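The first question, replacing the yellow marker pen, can be sketched as a simple rule applied to the carrier's itemised billing data. The assumptions here are hypothetical and not from this paper: each billed call carries the dialled number and its cost, and the company keeps a list of known business numbers. Anything not on that list is treated as personal, which a real policy might well refine.

```python
from decimal import Decimal


def split_bill(call_records, business_numbers):
    """Total the cost of calls to known business numbers and to all other numbers."""
    totals = {"business": Decimal("0"), "personal": Decimal("0")}
    for call in call_records:
        category = "business" if call["dialled"] in business_numbers else "personal"
        totals[category] += call["cost"]
    return totals


# Hypothetical company directory and one employee's itemised bill.
business_numbers = {"+441110000001", "+441110000002"}
call_records = [
    {"dialled": "+441110000001", "cost": Decimal("1.20")},
    {"dialled": "+441110000099", "cost": Decimal("0.80")},
    {"dialled": "+441110000002", "cost": Decimal("2.50")},
]

totals = split_bill(call_records, business_numbers)
```

The same itemised records, once loaded into the BI database, could also be compared with contract rates for the second question or summarised per employee for the third.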
As an estimate of potential cost savings, a company with over 100 mobiles and a fair degree of mobile roaming should be able to use data analytics tools to reduce current costs by up to 20%.

In both these examples of travel and mobile phone costs, the typical problems of Big Data do not really apply. They use smaller data sets and simpler tools, and much of the expertise in configuring and managing the projects is in-house. The value of external expertise would be in developing regular cost savings reports, facilitating agreement on the implementation of recommendations and drawing on a wider experience base of where to find cost savings.

Conclusion
Big Data will have a crucial part to play in gaining competitive advantage, but in order to make it work as a useful system and be able to action its findings, it needs the right people with the right skill sets. Since the level of complexity around Big Data is so high, we recommend putting more focus on old fashioned BI and applying these familiar tools to old problems. Big Data is the next big step in data analytics, but it is not the answer to everything.
For more information please contact: john_enoch@ymail.com