BIG DATA & BUSINESS ANALYTICS- the need,
applications, challenges, new trends and
a consulting perspective
(Why is Big Data a strategic need for optimization of organizational processes especially
in the business domains and what is the consultant’s role?)
Vikram Joshi
With every transaction and activity, organizations churn out data. This process happens
even in the case of idle operation. Hence, data needs to be effectively analyzed to manage
all processes better. Data can be used to make sense of the current situation and predict
outcomes. It can also be used to optimize business processes and operations. This is
easier said than done, as data is being produced at an unprecedented rate, in huge volumes
and with a high degree of variety. For the outcome of the data analysis to be relevant, all the
data sets must be factored into the analysis and predictions. This is where big data
analysis comes in, with sophisticated tools that are now also easy on the pocket if one
prefers open source.
The future of high-potential marketing lead generation will be based on big data.
Virtually every business vertical can benefit from big data initiatives. Even those without
deep pockets can use the cloud model for business analytics/big data analysis.
Some challenges remain to be addressed to engender large scale adoption but the current
benefits outweigh the concerns.
India has seen massive growth in big data adoption, and the trend will continue, though it is
so far mostly confined to the bigger players. As the quality of data improves and customers
become less reluctant to be honest when they volunteer data, forecasts will become
more accurate and Big Data will take its rightful place as a key enabler.
WHAT IS BIG DATA, BUSINESS ANALYTICS
“Big data1 is a popular term used to describe the exponential growth and availability of
data, both structured and unstructured. And big data is as important to business and
society as the Internet has become as more data may lead to more accurate analyses and
hence better decisions”: as per SAS, one of the global market leaders in this segment.
1
www.sas.com
Analytics2 is the methodology that uses statistical and operations research (OR)
techniques together with IT (Information Technology) software to make sense of trends,
visualize them and predict outcomes with a high degree of accuracy. There are 3 types of
analytics:
1. Descriptive analytics: The use of simple statistical techniques to describe what is
contained in a data set/database e.g. a bar chart to depict number of shoppers
segmented by tiered levels of purchases (small/medium/high value). Techniques
include measures of central tendency, measures of dispersion, charts, graphs,
sorting methods, frequency distributions, probability distributions, and sampling
methods.
2. Predictive analytics: The use of advanced statistical, OR (Operations
Research) and IT (Information Technology) software to identify variables whose
behavior can be predicted and to build models that identify/predict trends not
observed in descriptive analysis e.g. using techniques like multiple regression to
identify the presence or absence of a relationship between selected variables that
can help explain/predict their behavior due to interdependence, if any. Techniques
include statistical methods like multiple regression and ANOVA (Analysis of
Variance), information system methods like data mining and sorting, and operations
research methods like forecasting models.
3. Prescriptive analytics: The use of decision science, management science and
applied mathematical techniques to help make optimal use of allocated resources
e.g. linear programming models may be used to allocate the budget of a
big retail store that wants to reach the maximum number of customers with a limited
campaign budget. Techniques include operations research methodologies like linear
programming and decision theory.
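The three types can be illustrated with a small sketch. The Python below is a minimal illustration under stated assumptions, not a production method: all figures, the tier thresholds, the channel names and the helper functions (tier, fit_line, allocate_budget) are hypothetical. The descriptive step builds a frequency distribution and summary statistics. The predictive step uses a single-predictor least-squares fit as a simplified stand-in for multiple regression. The prescriptive step handles one very simple linear program (a single budget constraint plus a spending cap per channel, with non-negative yields), for which funding the highest-yield channel first gives the optimum.

```python
from collections import Counter
from statistics import mean, median, pstdev

# Hypothetical purchase values for ten shoppers (illustrative data only).
purchases = [120, 450, 80, 2300, 640, 95, 1500, 310, 60, 980]


def tier(value):
    """Assign a purchase to an assumed small/medium/high value tier."""
    if value < 200:
        return "small"
    if value < 1000:
        return "medium"
    return "high"


# Descriptive analytics: describe what the data set contains.
tier_counts = Counter(tier(v) for v in purchases)  # frequency distribution
summary = {
    "mean": mean(purchases),       # measure of central tendency
    "median": median(purchases),   # measure of central tendency
    "std_dev": pstdev(purchases),  # measure of dispersion
}


def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x (one predictor only)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Predictive analytics: fit past behavior and extrapolate.
ad_spend = [10, 20, 30, 40, 50]         # hypothetical campaign spend
new_customers = [25, 45, 65, 85, 105]   # hypothetical customers acquired
intercept, slope = fit_line(ad_spend, new_customers)
forecast_at_60 = intercept + slope * 60


def allocate_budget(channels, budget):
    """Fund channels in order of customers gained per unit of spend.

    channels: list of (name, customers_per_unit_spend, max_spend).
    Assumes non-negative yields; optimal only for this simple LP shape.
    """
    plan = {}
    for name, _rate, cap in sorted(channels, key=lambda c: c[1], reverse=True):
        spend = min(cap, budget)
        plan[name] = spend
        budget -= spend
    return plan


# Prescriptive analytics: decide how to use a limited campaign budget.
channels = [("print", 0.5, 100), ("email", 3.0, 40), ("social", 2.0, 50)]
plan = allocate_budget(channels, budget=100)

print(dict(tier_counts), summary)
print(round(forecast_at_60, 2))
print(plan)
```

With more than one constraint (for example separate limits on budget and staff time), the greedy rule no longer applies and a dedicated linear programming solver would normally be used instead.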
Business Analytics follows the same methodology as plain analytics but with the
caveat that the outcome of the analysis must make a clear and measurable
impact on business performance. Like business intelligence, it reports results, but it
also explains why the results have occurred, going beyond storage
and reporting. (Refer Figure 1)
Figure 1: Business Analytics Process [Business Analytics: Principles, Concepts &
Applications by Schniederjans & Starkey (Pearson)]
2
Schniederjans & Starkey, 2015, “Business Analytics: Principles, Concepts & Applications”, Pearson,
USA
Business Intelligence is a set of processes that convert data into useful information for
business utility. Some believe that it covers the fields of information systems, analytics
and business analytics as an umbrella of offerings while others believe that it is
concerned with the collection, storage, retrieval and exploration of large chunks of data to
enable better decision making and planning in almost all spheres of organizational
activity.
WHY IS BIG DATA NEEDED
The Oct 2012 issue of Harvard Business Review proclaimed Big Data the new
management revolution and cited the example of Amazon, which used predictive
algorithms to anticipate customer buying and effectively put many others in the same niche
out of business. Since decision making based on Big Data rests on facts, it does not
depend solely on the HiPPO (the highest-paid person’s opinion). This makes the quality of
decisions better and more appropriate. 3
Organizational performance has always churned out large chunks of data that require
silos for filing and methodologies to make sense of it. The availability of lower-cost
hardware, software and support makes it more feasible to retrieve and process
information, quickly and at lower costs than ever before.
The world is moving from ‘Traditional analytics’ to ‘Predictive analytics’ and now
increasingly towards ‘Prescriptive analytics’ (where the decisions are driven by
predictive models using business rules engines to help the companies to decide the “next
best action”).4
3
McAfee & Brynjolfsson, Oct 2012 issue, “Big Data: the management revolution”, Harvard Business
Review, < https://hbr.org/2012/10/big-data-the-management-revolution/ar>
Analytics is the natural result of four major global trends: Moore’s Law (the observation that
the number of transistors on a chip doubles roughly every two years, which keeps making
computing cheaper), and the three other components of SMAC (Social media, Mobile,
Analytics, Cloud): social media, mobile technology and cloud computing.
Traditional data management and analytics software and hardware technologies, open-
source technology, and commodity hardware are also merging to create new alternatives
for IT and business executives to take advantage of the next generation of analytics.
This has been driven by the exponential increase in the 3Vs of data: volume, velocity and
variety. The increase stems from the wide range of data sources, such as primary/secondary
research, location data, image data, process data, supply chain data, ERP (Enterprise Resource
Planning)/MIS (Management Information System) data, BYOD (Bring Your Own
Device) data etc., all of which have a say in business outcomes.
Big data analysis/ business analytics is being used in almost all spheres of activity to:
1. Improve operational efficiency by reducing waste and optimizing
resource usage;
2. Increase revenue by better forecasting and predicting customer behavior; and
3. Achieve competitive differentiation through better data-driven decisions.
Predictive analytics has been used, or is being considered, to discover new natural and
energy resources, predict consumer credit scores, assess health risks, detect fraud, predict
disasters, target high-potential leads and create moments of customer delight by giving
customers what they want before they know it.
Over half of the 1,217 global firms surveyed by TCS (Tata Consultancy Services) had
undertaken Big Data initiatives in 2012, and of those 643 companies, 43% predicted a
return on investment (ROI) of more than 25%. The median spending on Big Data by
Indian companies is expected to rise from the current $9.5 million to $12.5 million by the
end of 2015.5
Leading organizations are not just integrating data into their analysis and decision making
but also using it to design more effective products and services. (Refer Figure 2)
4
“The SMAC Code” report by KPMG & CII, 2013,
<https://www.kpmg.com/IN/en/IssuesAndInsights/ArticlesPublications/Documents/The-SMAC-code-
Embracing-new-technologies-for-future-business.pdf >
5
“The Emerging Big Returns on Big Data” report by TCS, 2013,
<http://www.tcs.com/SiteCollectionDocuments/Trends_Study/TCS-Big-Data-Global-Trend-Study-
2013.pdf >
Figure 2: Global Business Intelligence Market Size and Growth Y-o-Y by
Technologies (http://www.cloudcomputing-news.net/news/2014/jul/10/
roundup-of-analytics-big-data-business-intelligence-forecasts-and-market-estimates-2014/)
VERTICAL WISE IMPLEMENTATION EXAMPLES6
RETAIL
Big data methodologies can greatly improve marketing, merchandising, operations,
supply chain and after-sales service by offering better insights. They can be used in
inventory management, promotional analysis, store operations etc.
The US-based book retailer Barnes & Noble used a big data analytics solution to enable
suppliers to monitor its inventory and make real-time replenishment decisions.
Big data can also be used to better understand the target market, measure & understand
consumer behavior, understand the preferences of potential customers and hence design
better offerings and campaigns.
HEALTHCARE
The healthcare sector must manage large amounts of unstructured data and faces a serious
three-pronged challenge in terms of volume, variety and velocity (high rates of
generation). The main applications are preventive healthcare, drug discovery and
electronic health records.
Some of the largest integrated delivery networks in the USA such as Cleveland Clinic,
MedStar, University Hospitals, St. Joseph Health System, Catholic Health Partners and
Summa Health System have successfully been using the big data platform for real-time
exploration, performance and predictive analytics of clinical data.
TELECOM
Telecommunications involves trillions of small transactions on a daily basis that offer
insights in their own right.
Globally, telecom operators have been deploying big data tools for some years now to make
sense of their silos of data. This has helped operators to improve quality of service and
raise customer satisfaction in less time, thereby making the business more
profitable. The emergence of cloud-based open source platforms, coupled with big data,
has also resulted in faster and more economical processing and analysis of data.
Bharti Airtel, the largest operator in India, uses analytics to enable the marketing
department to deliver targeted campaigns to high potential customers on a daily basis.
The analytics function processes more than five billion transactions daily, contributing
significantly to the top line.
GOVERNMENT/PUBLIC SECTOR
6
“Six Converging Technology Trends” Report by KPMG & NASSCOM, 2013,
<https://www.kpmg.com/IN/en/IssuesAndInsights/ArticlesPublications/Documents/Six-Converging-tech-
trends.pdf >
The sheer volume, variety and velocity of data in this vertical make data management an
onerous task. More time is spent on collection than on analysis, which is
counterproductive. If analyzed in the right manner, this data has the potential to
enable better decisions and address some of the biggest challenges that plague governments.
To make the most of this data flood, governments across the world are now building
significant big data roadmaps. Big data now has a significant role to play in the delivery
of public services, defense & security, managing transportation and logistics, science,
R&D (Research & Development) etc.
The Indian government has been making extensive use of big data tools to power its
Aadhaar project. To capture and process such significant volumes of data, UIDAI
(Unique Identification Authority of India) runs three duplication servers powered by
MySQL and Hadoop. Aadhaar has become the world’s largest biometric ID system; it is
accepted throughout India and is used for ID proof, subsidy delivery, opening a bank
account and other essential services.
FINANCIAL SERVICES
The financial services industry is amongst the most data-driven industry verticals. Firms
have to store and analyze several years of transaction data as required by law. Electronic
trading means that capital markets firms generate billions of messages every day. The increasing
proliferation of social media is driving banks to keep track of their customers on these
platforms. Estimates suggest that over 80 percent of the data in this industry is
unstructured. There are other uses for big data, such as fraud detection, creditworthiness
assessment, cross-selling, customer retention etc.
Santander Bank in Spain sends its branches weekly lists of customers who it thinks may be
attracted to particular offers from the bank, such as insurance.
In Singapore, Citigroup keeps an eye on customers' credit card transactions for
opportunities to recommend discounts at nearby restaurants based on their tastes
and preferences. This sometimes earns the bank an additional transaction.
By running big data analytics on huge volumes of complex business data directly in the data
warehouse, the Chinese bank CITIC was able to uncover useful information and
predictions. It was able to use this data through a dedicated credit card centre set up
specifically to leverage customer insights across the bank's other departments.
TRAVEL/TRANSPORTATION7
Big data has found application in better traffic control, smarter roads,
intelligent cars etc.
7
“The SMAC Code” report by KPMG & CII, 2013,
<https://www.kpmg.com/IN/en/IssuesAndInsights/ArticlesPublications/Documents/The-SMAC-code-
Embracing-new-technologies-for-future-business.pdf >
Red Bus, an Indian online travel firm, successfully used Google’s BigQuery to analyze
booking and inventory data to create business advantage across its hundreds of bus
operators serving more than 10,000 routes. The company chose BigQuery over Hadoop
servers, which would have required more setup time as well as higher operating costs.
OTHER EXAMPLES OF IMPLEMENTATION8
• Google used big data to predict the next wave of influenza
• IBM used data to optimize traffic flow in the city of Stockholm, and to get the
best possible levels of air quality
• Dr. Jeffrey Brenner, a physician in New Jersey, used medical billing data to map
out hot spots where he found his city’s most complex and costly healthcare cases
as part of a program to lower healthcare costs
• The National Center for Academic Transformation used data mining to help
understand which college students are more likely to succeed in which courses
• CBG Health Research, a public-sector research organization in New Zealand,
created the HealthStat research tool, which enables primary health organizations
to identify trends (such as flu or gastroenteritis outbreaks) in real time. Also in
New Zealand, the Ministry of Social Development is using data to design targeted
programs for at-risk populations.9
CHALLENGES
There are many challenges to the successful implementation of a big data initiative:
• Reluctance of potential customers to share data and be studied, which is a concern in
India
• The sometimes questionable quality of the data used to draw insights
• Privacy and security concerns
• Data migration and portability issues
• Lack of adequately trained manpower
• High levels of organizational inertia to change
• Challenges of data visualization: there is a need for powerful visualization in
structured dashboards, an area that is steadily improving
• Scalability in terms of size, speed and complexity needs to be managed
• Data relevance, correlation and connectivity issues have to be taken care of
8
“The Global Information Technology Report 2014” by WEF in collaboration with INSEAD & Cornell
University, < http://www3.weforum.org/docs/WEF_GlobalInformationTechnology_Report_2014.pdf >
9
“The Global Information Technology Report 2015” by WEF in collaboration with INSEAD & Cornell
University, < http://www3.weforum.org/docs/WEF_Global_IT_Report_2015.pdf >
• Ensuring that the human user stays in the loop, especially in macro decisions
• Personalization and customization to user needs
• As per the Networked Readiness Index (NRI) 2015 in the GITR 2015 (Global
Information Technology Report) by WEF, India was ranked 89 out of 143 nations
surveyed. It was behind smaller neighbors like Bhutan, Sri Lanka, Mongolia,
Thailand etc. The NRI benchmarks the ICT readiness and usage of the
economies surveyed.10
• Unstructured data is difficult to tackle in various sectors like telecom,
retail and banking
• Even leveraging open source tools like Hadoop has hidden costs, such as hiring/training
costs and the need to upgrade to paid enterprise versions and enterprise-level servers,
which may be required for mission-critical solutions.
• The massive data generation by the Internet of Things/Internet of Everything is a
huge challenge
• Cognitive analytics will automate analytics in the future through bionic brains that
will get smarter with each iteration, but the question is whether they will replace
the human element or not11
• Machine learning will guide automation
• Inaccuracy in forecasting still remains a big concern for big data analytics
• Smartphone access to big data will be critical in the future, as smartphones are the
fastest-adopted technology trend ever and the computing units of the future. Internet
Protocol (IP) enabled sensors, like RFID (Radio Frequency Identification) tags,
are inexpensive and will be used extensively instead of manual data entry12
• Big Data results on the bottom line take time to show: there are few immediate
results
WHAT IT MEANS FOR INDIA
India’s Big Data market is set to touch $1 bn in 2015, with a CAGR (Compound Annual
Growth Rate) of almost twice the global CAGR for this space as per industry estimates.
Key gainers will be the BPO (Business Process Outsourcing) sector and those offering
data-driven decision-making solutions. The customized education sector will also see a
major impact, as study majors are matched to each student’s unique interests and skill sets.
There is vast scope for applying Big Data to e-governance in all its spheres:
government to citizens, government to government, government to employees and
government to business. Many projects in rural and
urban areas in India are underway. Presently over 1000 e-governance services are
operational in India as a part of NeGP (National e-Governance Plan).13
10
“The Global Information Technology Report 2015” by WEF in collaboration with INSEAD & Cornell
University, < http://www3.weforum.org/docs/WEF_Global_IT_Report_2015.pdf >
11
Thor Olavsrud, 9 Feb 2015, “8 Analytics Trends to watch in 2015”, CIO,
<http://www.cio.com/article/2881201/data-analytics/8-analytics-trends-to-watch-in-2015.html >
12
Philip Evans, 7 May 2015, “5 Key figures about Big Data”, OpenMind,
<https://www.bbvaopenmind.com/en/5-data-about-big-
data/?utm_source=twitter&utm_medium=techreview&utm_campaign=MITcompany&utm_content=5bigda
ta >
With the Digital India program announced by the Indian Prime Minister Mr Narendra
Modi, big data could provide the brain for the digital engine. It could be the
backbone for e-governance and for services on demand by citizens, and a means of digitally
empowering them. The vast amounts of data collected can only be made sense of through
big data initiatives.
NEW APPLICATIONS OF BIG DATA14
Drug discovery could be accelerated with machine learning using Big Medical Data,
which could save years of research time. This could be achieved through
computer model simulations; the Radins have done just that with a computer
algorithm called twoXAR.15
Sports training and personalized medicine are touted as the next big thing in Big Data.
These would enhance athletic performance and the general health of people. Using tracking
techniques, sporting geniuses are being studied to help raise the level of performance of
other athletes by breaking their performance down into measurable data sets.
THE CLOUD & BIG DATA16
Businesses of any size can unlock the potential of big data in cloud environments and derive
significant competitive advantage.
Cloud delivery models offer exceptional flexibility and economic advantages that enable
the big data adopter to launch an initiative easily and build on it. Private clouds can offer a
more efficient and economical model for implementing big data analysis in-house, alongside
public cloud options. This hybrid cloud approach enables companies to use on-demand
storage space and computing power via public cloud services for certain short-term
projects and to add capacity and scale on demand. Privacy and data security
concerns will drive the adoption of the appropriate model, with most of the sensitive
information hosted on a private cloud.
13
Kalbande, Deshpande & Popat, March 2015, “Review paper on use of Big Data in e-Governance of
India”, International Journal for Research in Emerging Science and Technology, Volume2, Special Issue
1,< http://ijrest.net/downloads/volume-2/special-issue-1/pid-m15ug642.pdf >
14
www.datanami.com
15
http://www.twoxar.com/
16
“ Big Data in the cloud: converging technologies”, April 2015, Intel IT Center,
<http://www.intel.in/content/dam/www/public/us/en/documents/product-briefs/big-data-cloud-
technologies-brief.pdf >
The full range of services is available with cloud-based AaaS (Analytics as a Service)
from data delivery and management to data usage with all the key capabilities that are
required from big data analysis/business analytics.
WHAT ABOUT HADOOP17
Hadoop is an open-source, Java-based software framework developed by the Apache
Software Foundation for storing data and running applications on clusters of commodity
hardware. It provides massive storage for any kind of data, enormous processing power
and the ability to handle virtually limitless concurrent tasks or jobs. Hadoop is currently one
of the most searched big data analysis terms in Google Trends and is one of the
most popular frameworks used for Big Data.
Paid enterprise-level versions are also available now.
Its key benefits are:
• Distributed computing model with quick processing
• High degree of flexibility: preprocessing of data is not required, and storage
includes unstructured data like text, videos, pictures, etc.
• Data and application processing is protected against hardware failure
• Low cost: it is free and uses commodity hardware
• Easy scalability
Hadoop came into the mainstream of the business world in 2010, and its product lifecycle
may now be moving from the maturity phase into a possible decline. However, it is still
preferred for unstructured data analysis, predictive customer analytics, sentiment analysis etc. 18
Its biggest issues seem to be:
• A widely acknowledged talent shortfall
• Some data security concerns that need to be addressed in the open
source version
• A lack of tools for data quality and standardization
• Programming via MapReduce does not suit all problems, especially
iterative and interactive analytic tasks
• Full-featured tools for data management, data cleansing, governance and metadata
are not very user friendly
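The MapReduce programming model mentioned above can be sketched in a few lines. The following is an in-memory Python illustration of the map, shuffle and reduce phases using a word count. It is not the Hadoop API, and the function names and sample documents are assumptions made purely for illustration. On a real cluster, the map and reduce calls would run in parallel on different nodes.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(mapped_pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in mapped_pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: combine the values collected for each key into one result."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents):
    """Run one complete map -> shuffle -> reduce pass over the documents."""
    mapped = []
    for doc in documents:  # each split could be mapped on a different node
        mapped.extend(map_phase(doc))
    return reduce_phase(shuffle_phase(mapped))


# Hypothetical input splits.
documents = ["big data needs big tools", "data drives decisions"]
counts = word_count(documents)
print(counts)
```

Because every job is one complete map, shuffle and reduce pass, an iterative algorithm has to chain many such jobs one after another, which is one reason iterative and interactive workloads are a weak spot for this model.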
17
http://www.sas.com/en_us/insights/big-data/hadoop.html
18
James Kobielus, 3 March 2015, “ Hadoop is probably as mature as it’s going to get”, InfoWorld- Extreme
Analytics, < http://www.infoworld.com/article/2891832/big-data/hadoop-is-probably-as-mature-as-its-
going-to-get.html >
Google’s Cloud Dataflow and Apache Spark are seen by some as threats to Apache
Hadoop, though that threat is yet to materialize and Hadoop continues to hold its own.
AUTOMATION AND BIG DATA
IBM Watson Analytics offers smart data discovery without the complexity
generally associated with Big Data. It is a smart cloud-based data discovery
service that performs data exploration, automates predictive analytics and enables easy
dashboard and infographic creation. All of this is automated and completed in minutes. It
takes the complex mathematics and coding out of analytics and does them for you. There
are three editions: free, personal and professional. The free edition allows
up to 100,000 rows and 50 columns per data set, includes 500 MB of storage,
and can upload delimited and Excel files. 19
To stay competitive as data continues to grow exponentially, one would need Artificial
Intelligence (AI) to derive maximum value from the data within short deadlines.
This is where machine learning, popularly referred to as artificial intelligence, comes
in. It adds an intelligence layer to big data to handle large numbers of complex
analytical tasks much faster, without human intervention and with smaller margins of error
than humans could ever hope to achieve. AI can perceive its environment and know what to do.
It is capable of ‘deep learning’: instead of humans telling machines what to do, the machines
figure it out themselves based on the data.20
CONSULTING IN BIG DATA
There is a great potential for business opportunities in the big data space that revolve
around consulting advice on software and hardware issues.
IDC (International Data Corporation) has forecast that the global big data technology and
services market will grow at a CAGR of 27% to over $32 billion through 2017. That's six
times the growth rate of the overall ICT (Information and Communications Technology)
market. Systems integrators, VARs (Value Added Resellers), and service providers are
poised to take advantage of this market with big data consulting services and hardware
resale.21
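The arithmetic behind such forecasts is easy to check. The Python sketch below shows the standard compound annual growth rate relationship, end = start × (1 + rate)^years. The 27% rate, the $32 billion figure and the "six times" comparison come from the forecast quoted above; the four-year horizon and the function names are assumptions for illustration only, since the base year is not stated here.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate implied by a start and an end value."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value, rate, years):
    """Value reached after compounding `rate` annually for `years` years."""
    return start_value * (1 + rate) ** years


def implied_start(end_value, rate, years):
    """Starting value that grows to `end_value` at `rate` over `years`."""
    return end_value / (1 + rate) ** years


big_data_cagr = 0.27   # from the quoted forecast
end_market_bn = 32.0   # from the quoted forecast, in $ billion
ASSUMED_YEARS = 4      # hypothetical horizon, not given in the text

# Growth of the overall ICT market implied by "six times the growth rate".
overall_ict_growth = big_data_cagr / 6

# Market size at the start of the assumed horizon.
base_bn = implied_start(end_market_bn, big_data_cagr, ASSUMED_YEARS)

print(f"implied ICT growth: {overall_ict_growth:.1%}")
print(f"implied base market: ${base_bn:.1f} bn over {ASSUMED_YEARS} years")
```

Changing ASSUMED_YEARS shows how sensitive the implied starting market size is to the length of the forecast window.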
19
http://www.ibm.com/analytics/watson-analytics
20
Maycotte, 16 Dec 2014, “ Why Big Data and AI need each other- and you need them both”, Forbes, <
http://www.forbes.com/sites/homaycotte/2014/12/16/why-big-data-and-ai-need-each-other-and-
you-need-them-both/ >
21
Crystal Bedell, Dec 2013, “Big Data Consulting services top analytics opportunities for channel” ,
TechTarget, < http://searchitchannel.techtarget.com/feature/Big-data-consulting-services-top-analytics-
opportunities-for-channel >
There are also opportunities in strategic consulting in Big Data as more organizations
take a strategic perspective to it. All major IT channel companies in this space have taken
the approach of a solution provider to provide optimal business value to clients rather
than just being product/service sellers. They are using a one stop shop perspective for the
complete project execution including strategic advice through a consulting model.
Consultants must take a problem solving approach to Big Data with a strategic
perspective. The client needs to select the problem that needs fixing and the consultant
needs to design the Big Data solution to resolve the problem. This is the right way rather
than being enamored by a tool/technology and then looking for the problem that it can
solve.
Competitive insights into what competitors and market leaders are doing help bring a
good perspective to technology selection. This information may not be available to the
client and the consultant may be of assistance. The consultant firm’s combined expertise
and experience in Big Data especially in a relevant business domain will be of immense
strategic value to a client in decision making.
The consultant can help the client in selecting appropriate technologies/methodologies
and hardware for procurement, installation, training and maintenance. They could offer
the optimal solution that takes into account existing software and hardware infrastructure.
Competitive procurement analysis and execution can result in a lot of savings for a client.
Consultants can guide the client organization through the learning curve of the
technology as they come to grips with it. Data migration, portability and integration are
all concerns that a consultant can address.
The intellectual and experiential capital of the consultant can make all the difference for
the execution and maintenance of a Big Data project.
PUTTING IT ALL TOGETHER
The lure of big data cannot be ignored. It can be useful for virtually any sphere of
organizational activity, and cost-effective models like open source and cloud hosting
keep entry barriers low and make its adoption very attractive. The biggest
problem seems to be questionable data quality, which leads to inaccurate
predictions and analysis. Public relations exercises must be undertaken to educate
customers and assure them of their privacy and of complete data
security. Once the data reflects the actual situation correctly, business analytics
forecasts will be more on target.
Rather than an all-out adoption approach, even for those with deep pockets, a strategic
approach would be better for the bottom line. One of the most attractive new segments
would be marketing analytics that would predict customer buying behavior with a high
degree of accuracy. Manufacturing and supply chain management departments are also
users of big data analysis to better manage their processes and optimize them. Some of
the other applications have already been covered earlier.
The only major downside is the loss of human jobs, but the people affected can be
redeployed more effectively, and humans will still be required until machines with
artificial intelligence take over business analytics completely.
The Indian Big Data market is slated for a heavy double digit growth figure and with
India going digital, big data analysis techniques would be required to make sense of the
huge mountains of data.
Mainstreaming of Big Data has not yet happened and is unlikely to happen as it is a
specialized domain. Many understand the advantages of Big Data but have the
misconception that costs are prohibitive, and do not know how to go about it. They do not
undertake the project because they feel that they are not geared for it. This is where the
specialist/consultant is of immense value. Part of the consultant team’s work would include
awareness building and customer education to create interested leads and to address
customer concerns.
Tracking appropriate metrics and monitoring ROI (Return on Investment) are critical, as
each Big Data project must be outcome based. With the open source technologies
available, high spending power is not required for Big Data adoption.
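ROI monitoring of this kind can be sketched very simply. The Python below is a minimal illustration, assuming ROI is measured as net gain divided by cost; the function names and project figures are hypothetical, and the 25% default threshold merely echoes the "more than 25%" return mentioned in the TCS survey cited earlier.

```python
def roi(gain, cost):
    """Return on investment as a fraction: (gain - cost) / cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (gain - cost) / cost


def meets_target(gain, cost, target=0.25):
    """True when the project's ROI exceeds the chosen target."""
    return roi(gain, cost) > target


# Hypothetical Big Data projects: (name, measured gain, total cost).
projects = [
    ("lead scoring", 520_000, 400_000),
    ("inventory forecasting", 450_000, 400_000),
]

for name, gain, cost in projects:
    status = "on target" if meets_target(gain, cost) else "below target"
    print(f"{name}: ROI {roi(gain, cost):.1%} ({status})")
```

In practice the hard part is attributing the gain to the project, so the metric feeding `gain` needs to be agreed before the project starts.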

  • 3.
    Business Intelligence isa set of processes that convert data into useful information for business utility. Some believe that it covers the fields of information systems, analytics and business analytics as an umbrella of offerings while others believe that it is concerned with the collection, storage, retrieval and exploration of large chunks of data to enable better decision making and planning in almost all spheres of organizational activity. WHY IS BIG DATA NEEDED The Oct 2012 issue of Harvard Business Review proclaimed Big Data as the new management revolution and cites the example of the Amazon who used predictive algorithms to predict customer buying and effectively put many others in the same niche out of business. Since decision making based on Big Data is based on facts it does not just depend on HiPPO- the highest-paid person’s opinion. This makes the quality of decisions better and more appropriate. 3 Organizational performance has always churned out large chunks of data that require silos for filing and methodologies to make sense of it. The availability of lower-cost hardware, software and support makes it more feasible to retrieve and process information, quickly and at lower costs than ever before. The world is moving from ‘Traditional analytics’ to ‘Predictive analytics’ and now increasingly towards ‘Prescriptive analytics’ (where the decisions are driven by 3 McAfee & Brynjolfsson, Oct 2012 issue, “Big Data: the management revolution” , Harvard Business Review, < https://hbr.org/2012/10/big-data-the-management-revolution/ar>
  • 4.
    predictive models usingbusiness rules engines to help the companies to decide the “next best action”).4 Analytics is the natural result of four major global trends: Moore’s Law (which says that technology always gets cheaper), and the three other components of SMAC (Social media, mobile, analytics, cloud) - social media, mobile technology and cloud computing options. Traditional data management and analytics software and hardware technologies, open- source technology, and commodity hardware are also merging to create new alternatives for IT and business executives to take advantage of the next generation of analytics. This has been due to the exponential increase in the 3Vs of data- volume, velocity and variety. This is due to wide range of data sources from primary/secondary research, location data, image data, process data, supply chain data, ERP(Enterprise Resource Planning)/MIS (Management Information System) data, BYOD (Bring Your Own Device) data etc that all have a say in business outcome. Big data analysis/ business analytics is being used in almost all spheres of activity to: 1. Improve all operational efficiencies by reducing waste and better optimization of resource usage; 2. Increase revenue by better forecasting and predicting customer behavior; and 3. Achieving competitive differentiation by better data driven decisions. Predictive analytics has been used or is being looked at to discover new resources- natural/energy, predict consumer credit scores, assess health risks, detect fraud, predict disasters, target high potential leads and to create moments of customer delight by giving the customer what he/she wants before they know it. Over half of the 1,217 global firms surveyed by TCS (Tata Consultancy Services), had undertaken Big Data initiatives in 2012, and of those 643 companies, 43% predicted a return on investment (ROI) of more than 25%. 
The median spending on Big Data by Indian companies is expected to rise from the current $9.5 million to $12.5 million by the end of 2015.5 Leading organizations are not just integrating data into their analysis and decision making but also using it to design more effective products and services. (Refer Figure 2) 4 “The SMAC Code” report by KPMG & CII, 2013, <https://www.kpmg.com/IN/en/IssuesAndInsights/ArticlesPublications/Documents/The-SMAC-code- Embracing-new-technologies-for-future-business.pdf > 5 “The Emerging Big Returns on Big Data” report by TCS, 2013, <http://www.tcs.com/SiteCollectionDocuments/Trends_Study/TCS-Big-Data-Global-Trend-Study- 2013.pdf >
  • 5.
    Figure 2: GlobalBusiness Intelligence Market Size and Growth Y-OY by Technologies Figure 2: Global Intelligence/BI market (http://www.cloudcomputing-news.net/news/2014/jul/10/ roundup-of-analytics-big-data-business-intelligence-forecasts-and-market-estimates-2014/)
  • 6.
    VERTICAL WISE IMPLEMENTATIONEXAMPLES6 RETAIL Big data methodologies can greatly improve marketing, merchandising, operations, supply chain and after-sales service by offering better insights. They can be used in inventory management, promotional analysis, store operations etc. The US-based book retailer, Barnes & Noble used a big data analytics solution to enable suppliers to monitor its inventory and take real-time replenishment decisions. Big data can also be used to better understand the target market, measure & understand consumer behavior, understand the preferences of potential customers and hence design better offerings and campaigns. HEALTHCARE There is a challenge of managing large amounts of unstructured data and faces a serious three pronged challenge in terms of volume, variety and velocity (high rates of generation). The main applications are preventive healthcare, drug discovery and electronic health records. Some of the largest integrated delivery networks in the USA such as Cleveland Clinic, MedStar, University Hospitals, St. Joseph Health System, Catholic Health Partners and Summa Health System have successfully been using the big data platform for real-time exploration, performance and predictive analytics of clinical data. TELECOM Telecommunications involves trillions of small transactions on a daily basis that offer insights in their own right. Globally, telecom operators are deploying big data tools to make sense of the silos of data for some years now. This has helped operators to improve quality of service by increasing customer satisfaction in shorter times, and thereby making the business more profitable. The emergence of cloud-based open source platforms, coupled with big data has also resulted in faster processing and analysis of data with economical costs. Bharti Airtel, the largest operator in India, uses analytics to enable the marketing department to deliver targeted campaigns to high potential customers on a daily basis. 
The analytics function processes more than five billion transactions daily, contributing significantly to the top line. GOVERNMENT/PUBLIC SECTOR 6 “Six Converging Technology Trends” Report by KPMG & NASSCOM, 2013, <https://www.kpmg.com/IN/en/IssuesAndInsights/ArticlesPublications/Documents/Six-Converging-tech- trends.pdf >
  • 7.
    The sheer volume,variety and velocity of data in this vertical make data management an onerous task. More time is spent on collection than in analysis which is counterproductive in a way. If analyzed in the right manner, this data has the potential to create better decisions and address some of the biggest challenges that plague them. To make the most of this data flood, governments across the world are now building significant big data roadmaps. Big data now has a significant role to play in the delivery of public services, defense & security, managing transportation and logistics, science, R&D (Research & Development) etc. The Indian government has been making extensive use of big data tools to power its Aadhaar project. To capture and process such significant volumes of data, UIDAI (Unique Identification Authority of India) runs three duplication servers powered by MySQL and Hadoop. It has become the world’s largest biometric ID system and is used for ID proof, subsidy delivery, opening a bank account and other essential services and is accepted throughout India. FINANCIAL SERVICES The financial services industry is amongst the most data driven industry verticals. They have to store and analyze several years of transaction data as per laws. Electronic trading means that capital markets firms generate billions of messages every day. The increasing proliferation of social media is driving banks to keep track of their customers on these platforms. Estimates put the figure at over 80 percent of the data in this industry is unstructured. They are other uses for big data like fraud detection, credit worthiness, cross selling, customer retention etc. Santander Bank in Spain sends out weekly lists of customers who it thinks may be attracted to particular offers from the bank, such as insurance, to its branches. 
In Singapore Citigroup keeps an eye on customers' credit card transactions for opportunities to recommend them discounts in restaurants nearby based on their tastes and preferences. That sometimes gave the bank an additional second transaction. By using big data on huge volumes of complex business data directly in the data warehouse, the Chinese bank CITIC was able to uncover useful information & predictions. It was able to use data across a dedicated credit card centre setup especially for leveraging customer insights across the other bank departments. TRAVEL/TRANSPORTATION7 Big data has uses and has found application in better traffic control, smarter roads, intelligent cars etc. 7 “The SMAC Code” report by KPMG & CII, 2013, <https://www.kpmg.com/IN/en/IssuesAndInsights/ArticlesPublications/Documents/The-SMAC-code- Embracing-new-technologies-for-future-business.pdf >
  • 8.
    Red Bus- anIndian online travel firm, successfully used Google’s BigQuery to analyze booking and inventory data to create business advantage across their hundreds of bus operators serving more than 10,000 routes. The company chose BigQuery over Hadoop servers, which required more set up time as well as higher operating costs. OTHER EXAMPLES OF IMPLEMENTATION8  Google used big data to predict the next wave of influenza  IBM used data to optimize traffic flow in the city of Stockholm, and to get the best possible levels of air quality  Dr. Jeffrey Brenner, a physician in New Jersey, used medical billing data to map out hot spots where he found his city’s most complex and costly healthcare cases as part of a program to lower healthcare costs  The National Center for Academic Transformation used data mining to help understand which college students are more likely to succeed in which courses  CBG Health Research, a public-sector research organization in New Zealand, created the HealthStat research tool, which enables primary health organizations to identify trends—such as flu or gastroenteritis outbreaks—in real time. 
Also in New Zealand, the Ministry of Social Development is using data to design targeted programs for at-risk populations.9 CHALLENGES There are many challenges to the successful implementation of a big data initiative:  The reluctance by potential customers to share data and be studied is a concern in India  The quality of the data that is being used to draw insights is questionable sometimes  Privacy and Security concerns and issues  Data migration and portability issues  Lack of trained manpower of an acceptable quality  High levels of organizational inertia to change  Challenges of data visualization- there is a need for powerful visualization in structured dashboards that is improving as we speak  Scalability for size, speed and complexity issues needs to be managed  Data relevance, correlation and connectivity issues have to be taken care of 8 “The Global Information Technology Report 2014” by WEF in collaboration with INSEAD & Cornell University, < http://www3.weforum.org/docs/WEF_GlobalInformationTechnology_Report_2014.pdf > 9 “The Global Information Technology Report 2015” by WEF in collaboration with INSEAD & Cornell University, < http://www3.weforum.org/docs/WEF_Global_IT_Report_2015.pdf >
  • 9.
     Ensuring humanuser interface stays in the loop especially in macro decisions  Personalization & Customization to user needs  As per the Network Readiness Index (NRI) 2015 in the GITR 2015 (Global Information Technology Report) by WEF, India was 89 out of 148 nations surveyed. It was behind smaller neighbors like Bhutan, Sri Lanka, Mongolia, Thailand etc. The NRI benchmarks the ICT readiness and usage of their economies.10  There are difficulties to tackle unstructured data in various sectors like telecom, retail and banking  Even leveraging open source like Hadoop has hidden costs like hiring/training costs, need to upgrade to enterprise versions and enterprise level servers that are paid may be required for mission critical solutions.  The massive data generation by the Internet of Things/ Internet of Everything is a huge challenge  Cognitive analytics will automate analytics in the future through bionic brains that will get smarter with each iteration but the question is whether they will replace the human element or not11  Machine learning will guide automation  Inaccuracy in forecasting still remains a big concern for big data analytics  Smartphone access to big data will be critical in the future as they are fastest adopted technology trend ever are the computing units of the future. Internet Protocol (IP) enabled sensors like RFID (Radio Frequency Identification) tagging is inexpensive and will be extensively used instead of manual data punching12  Big Data results on the bottom line take time to show: there are few immediate results WHAT IT MEANS FOR INDIA India’s Big Data market is set to touch $ 1bn in 2015 with a CAGR (Compound Annual Growth Rate) of almost twice the global CAGR for this space as per industry estimates. Key gainers will be the BPO (Business Process Outsourcing) sector and those offering data driven decision making solutions. 
The customized education sector will also see a major impact that will match study majors to the student’s unique interests and skill sets. There is vast scope of the applicability of Big Data in e-governance in all its spheres: government to citizens, government, employees, and business. Many projects in rural and 10 “The Global Information Technology Report 2015” by WEF in collaboration with INSEAD & Cornell University, < http://www3.weforum.org/docs/WEF_Global_IT_Report_2015.pdf > 11 Thor Olavsrud, 9 Feb 2015, “ 8 Analytics Trends to watch in 2015”, CIO, <http://www.cio.com/article/2881201/data-analytics/8-analytics-trends-to-watch-in-2015.html > 12 Philip Evans, 7 May 2015, “ 5 Key figures about Big Data”, OpenMind, <https://www.bbvaopenmind.com/en/5-data-about-big- data/?utm_source=twitter&utm_medium=techreview&utm_campaign=MITcompany&utm_content=5bigda ta >
  • 10.
    urban areas inIndia are underway. Presently over 1000 e-governance services are operational in India as a part of NeGP (National e-Governance Plan).13 With the Digital India program announced by the Indian Prime Minister Mr Narendra Modi, big data could provide the brain for the digital engine. These could be the backbone for e-governance & services on demand by citizens and for digitally empowering them. The vast amounts of data collected could only by made sense of by big data initiatives. NEW APPLICATIONS OF BIG DATA14 Drug discovery could be accelerated with machine learning using Big Medical Data which could result in savings of years of research time. This could be due to the use of computer model simulations and the Radins have done just that with a computer algorithm called twoXAR.15 Sports training and personalized medicine are touted as the next big thing in Big Data. These would enhance athletic performance and general health of people. Using tracking techniques, sport geniuses are being studied to help raise the level of performance of other athletes by breaking it down to data sets that are measurable. THE CLOUD & BIG DATA16 Businesses of any can unlock the potential of big data in cloud environments and derive significant competitive advantage. Cloud delivery models offer exceptional flexibility and economic advantages that enable the big data adopter to easily launch and build on it. Private clouds can offer a more efficient and economical model to implement big data analysis in-house, along with the public cloud options. This hybrid cloud option enables companies to use on-demand storage space and computing power via public cloud services for certain short term projects and provide added capacity and scale on demand. Privacy and data security concerns will drive the adoption of the appropriate model with most of the sensitive information hosted on a private cloud. 
13 Kalbande, Deshpande & Popat, March 2015, “Review paper on use of Big Data in e-Governance of India”, International Journal for Research in Emerging Science and Technology, Volume2, Special Issue 1,< http://ijrest.net/downloads/volume-2/special-issue-1/pid-m15ug642.pdf > 14 www.datanami.com 15 http://www.twoxar.com/ 16 “ Big Data in the cloud: converging technologies”, April 2015, Intel IT Center, <http://www.intel.in/content/dam/www/public/us/en/documents/product-briefs/big-data-cloud- technologies-brief.pdf >
  • 11.
    The full rangeof services is available with cloud-based AaaS (Analytics as a Service) from data delivery and management to data usage with all the key capabilities that are required from big data analysis/business analytics. WHAT ABOUT HADOOP17 Hadoop is an open-source software framework based on Java developed by Apache Software Foundation for storing data and running applications on clusters of commodity hardware. It provides massive storage for any kind of data, enormous processing power and the ability to handle virtually limitless concurrent tasks or jobs. Hadoop is one of the most searched terms in Google Trends in big data analysis currently and is one of the most popular frameworks used for Big Data. Enterprise level versions are also available now on payment. Its key benefits are:  Distributed computing model with quick processing  High degrees of flexibility and preprocessing of data is not required and storage includes unstructured data like text, videos, pictures, etc.  Data and application processing is protected against hardware failure  Low cost: It is free and uses commodity hardware  Scalability is easy Hadoop came into the mainstream of the business world in 2010 and the product lifecycle may be leaving the maturation phase into a possible decline. However, it is preferred for unstructured data analysis, predictive customer analytics, sentiment analysis etc. 
18 Its biggest issues seem to be:  Widely acknowledged talent short fall  There are some data security concerns that need to be addressed in the open source version  Tools for data quality and standardization are lacking  Programming via the tool MapReduce does not resolve all problems especially those for iterative and interactive analytic tasks  Full feature tools for data management, data cleansing, governance and metadata are not very user friendly 17 http://www.sas.com/en_us/insights/big-data/hadoop.html 18 James Kobielus, 3 March 2015, “ Hadoop is probably as mature as it’s going to get”, InfoWorld- Extreme Analytics, < http://www.infoworld.com/article/2891832/big-data/hadoop-is-probably-as-mature-as-its- going-to-get.html >
  • 12.
    Google’s Cloud Dataflowand Apache Spark are seen by some as threats to Apache Hadoop though it is yet to happen and Hadoop continues to hold its own. AUTOMATION AND BIG DATA IBM Watson Analytics offers smart data discovery options without the complexity hassles generally associated with Big Data. It is a smart cloud based data discovery service that does data exploration, automates predictive analytics and enables easy dashboard and info graphic creation. All of this is automated and completed in minutes. It takes away the complex mathematics and coding for analytics and does it for you. There are three editions namely free, personal and professional editions. The free edition can have upto 100,000 rows per data set, 50 columns per data set, included storage is 500mb, and it can upload delimited and excel files. 19 To stay competitive as data continues to grow exponentially, one would need Artificial Intelligence (AI) to derive maximum value from the data within the short time deadlines. This is where machine learning or what is popularly called artificial intelligence comes in. It will add an intelligence layer to big data to handle large numbers of complex analytical tasks much faster, without human intervention and with lesser margins of error than humans could ever hope to. AI can perceive its environment and know what to do. It is capable of ‘deep learning’ that is instead of humans telling them what to do; they will figure it out themselves based on the data.20 CONSULTING IN BIG DATA There is a great potential for business opportunities in the big data space that revolve around consulting advice on software and hardware issues. IDC (International Data Corporation) has forecast that the global big data technology and services market will grow at a CAGR of 27% to over $32 billion through 2017. That's six times the growth rate of the overall ICT (Information and Communications Technology) market. 
Systems integrators, VARs (Value Added Resellers), and service providers are poised to take advantage of this market with big data consulting services and hardware resale.21 19 http://www.ibm.com/analytics/watson-analytics 20 Maycotte, 16 Dec 2014, “ Why Big Data and AI need each other- and you need them both”, Forbes, < http://www.forbes.com/sites/homaycotte/2014/12/16/why-big-data-and-ai-need-each-other-and- you-need-them-both/ > 21 Crystal Bedell, Dec 2013, “Big Data Consulting services top analytics opportunities for channel” , TechTarget, < http://searchitchannel.techtarget.com/feature/Big-data-consulting-services-top-analytics- opportunities-for-channel >
  • 13.
    There are alsoopportunities in strategic consulting in Big Data as more organizations take a strategic perspective to it. All major IT channel companies in this space have taken the approach of a solution provider to provide optimal business value to clients rather than just being product/service sellers. They are using a one stop shop perspective for the complete project execution including strategic advice through a consulting model. Consultants must take a problem solving approach to Big Data with a strategic perspective. The client needs to select the problem that needs fixing and the consultant needs to design the Big Data solution to resolve the problem. This is the right way rather than being enamored by a tool/technology and then looking for the problem that it can solve. Competitive insights into what competitors and market leaders are doing help bring a good perspective to technology selection. This information may not be available to the client and the consultant may be of assistance. The consultant firm’s combined expertise and experience in Big Data especially in a relevant business domain will be of immense strategic value to a client in decision making. The consultant can help the client is selecting appropriate technologies/ methodologies and hardware for procurement, installation, training and maintenance. They could offer the optimal solution that takes into account existing software and hardware infrastructure. Competitive procurement analysis and execution can result in a lot of savings for a client. Consultants can guide the client organization through the learning curve of the technology as they come to grips with it. Data migration, portability and integration are all concerns that a consultant can address. The intellectual and experiential capital of the consultant can make all the difference for the execution and maintenance of a Big Data project. PUTTING IT ALL TOGETHER The lure of big data cannot be ignored. 
It can be useful for virtually any sphere of organizational activity and cost effective models like the open source and cloud hosting do not offer onerous entry barriers and make its adoption very attractive. The biggest problem seems to be the quality of data that is questionable and leads to inaccurate predictions and analysis. Public relations exercises must be undertaken to educate the customer in terms of assuring him of his/her privacy and the provision of complete data security. Once the data reflects the actual situation correctly, the business analytics forecasts would be more on target. Rather than an all-out adoption approach even for those with deep pockets, a strategic approach would be a better for the bottom-line. One of the most attractive new segments would be marketing analytics that would predict customer buying behavior with a high degree of accuracy. Manufacturing and supply chain management departments are also users of big data analysis to better manage their processes and optimize them. Some of the other applications have already been covered earlier.
  • 14.
    The only downsideand major concerns are the loss of human jobs but they can be redeployed more effectively and humans would still be required till machines with artificial intelligence take over business analytics completely. The Indian Big Data market is slated for a heavy double digit growth figure and with India going digital, big data analysis techniques would be required to make sense of the huge mountains of data. Mainstreaming of Big Data has not yet happened and is unlikely to happen as it is a specialized domain. Many understand the advantages of Big Data but have the misconception that costs are prohibitive and are clueless about how to do it. They do not undertake the project because they feel that they are not geared for it. This is where the specialist/consultant is of immense value. Part of the work of his team would include awareness building and customer education to create interested leads and also to address customer concerns. Tracking of appropriate metrics and ROI (Return on investment) monitoring is critical as each Big Data project must be outcome based. With the open source technologies available, high spending power is not required for Big Data adoption.