BIG DATA – Beyond the Hype
If one were to run a MapReduce job over all the white papers, articles and presentations on Business
Intelligence & Analytics created in the past couple of years, stored on a Hadoop Distributed File
System (HDFS), to count the most frequently occurring 2-grams (pairs of words that occur together),
BIG DATA would almost certainly top the list. So, instead of attempting to define BIG DATA in this
blog post, I would like to focus on its business value.
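To make the thought experiment concrete, here is a minimal sketch of the 2-gram count in plain Python, structured as the map, shuffle and reduce phases that a MapReduce job would go through. It is an illustration only and does not use Hadoop or HDFS: the function names (map_bigrams, shuffle, reduce_counts, top_bigrams) and the three sample sentences are assumptions made up for this example, not part of any real library or corpus.

```python
import re
from collections import defaultdict
from itertools import chain


def map_bigrams(document):
    """Map phase: emit a ((word1, word2), 1) pair for every adjacent word pair."""
    words = re.findall(r"[a-z0-9']+", document.lower())
    return [((first, second), 1) for first, second in zip(words, words[1:])]


def shuffle(pairs):
    """Shuffle phase: group all emitted values by their bigram key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(grouped):
    """Reduce phase: sum the values collected for each bigram."""
    return {key: sum(values) for key, values in grouped.items()}


def top_bigrams(documents, n=1):
    """Run map, shuffle and reduce, then return the n most frequent bigrams."""
    mapped = chain.from_iterable(map_bigrams(doc) for doc in documents)
    counts = reduce_counts(shuffle(mapped))
    # Sort by descending count, breaking ties alphabetically for stable output.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))[:n]


# Hypothetical sample corpus, invented for this sketch.
sample_docs = [
    "Big data is changing business intelligence.",
    "Analytics teams invest in big data platforms.",
    "Big data needs new tools.",
]

result = top_bigrams(sample_docs, n=1)
print(result)  # [(('big', 'data'), 3)]
```

In a real Hadoop job the map and reduce functions would run in parallel across many machines and the framework would handle the shuffle; here everything runs in a single process so the logic is easy to follow.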
For many years, BI practitioners have dealt with structured data, managing and harnessing it for
insights. Business decision making and Analytics have improved remarkably as the domain has taken a
great leap forward. Nevertheless, the focus has always been on structured data, which is typically
only 20-25% of the data generated by an organization. The remaining 75-80% is unstructured data
(text documents, files etc.), and hitherto there has been no established system, technique or
platform to derive insights from it.
With the advent of BIG DATA techniques (in which Hadoop and MapReduce play a big part), businesses
can for the first time confidently say that they can build the capability to manage large volumes of
data (terabytes to petabytes to exabytes), handle different varieties of data (structured,
semi-structured and unstructured), keep up with ever-increasing data velocity and perform complex
analyses that have high variability.
The diagram below illustrates the evolving architectural paradigm of combining structured and
unstructured data analysis. The top half shows the unstructured data architecture, while the bottom
half shows the BI architecture corresponding to structured data analysis. The real value, however,
lies in combining the insights from both halves for a variety of use cases that truly enable
organizations to compete on Analytics.
There are many interesting aspects in the diagram shown above, and we at Hexaware have started
working on proofs of concept for our customers that combine the structured and unstructured worlds
of data. Each of the components mentioned above will be explained in subsequent blog posts.
Thanks for reading. Please do provide your feedback.