This document summarizes the key points from a discussion of the article "Big data: Dimensions, evolution, impacts, and challenges". The discussion focused on how the volume, velocity, and variety of data have increased dramatically as technology allows more data from more sources to be recorded. Specifically, the document notes that data is now recorded from online social networks, IoT devices, web usage, web structure, and web content, including text, images, and video. The challenges of analyzing this large amount of unstructured data were also discussed.
Dear Students,
This document summarizes what we learned from the article “Big data: Dimensions, evolution, impacts,
and challenges” (discussed in Session 5). We had a very detailed discussion of that article. Based
on the students’ inputs and their continuous involvement in the discussion, we could draw
the following points:
1. We tried to understand the concept of Big Data. In earlier days, the data existed but we were
not able to record it. Now, with the advent of information technology, we are
able to record almost all the activity happening around us.
2. Earlier, only the employees of an organization recorded data. Then
came the era of online social networks, where every user started recording data. Now
we are living in the age of the Internet of Things (IoT), where the electronic devices around us are
also recording huge amounts of data.
3. The data generated in today’s world are large in amount (Volume), are generated
at high speed (Velocity), and are of different types (Variety).
4. Most of the data generated in today’s world are unstructured in nature (video, audio, images, text).
Big data technologies face the challenge of converting this unstructured data
into a structured form to get meaningful insights from it.
5. We discussed the web mining techniques that are used to analyze users’ online
activities. Web mining can be divided into three different types: web usage mining, web
structure mining, and web content mining.
6. Web usage mining is the application of data mining techniques to discover web users’
usage patterns online. Usage data captures the identity or origin of web users along with
their browsing behavior. The ability to track individual users’ mouse clicks, searches, and
browsing patterns makes it possible to provide personalized services to users.
7. Web structure mining is the process of analyzing the structure of a website or a web page.
The structure of a typical website consists of web pages as nodes and hyperlinks as edges
connecting related pages.
8. Web content mining is the process of extracting useful information from the content of web
pages. A web page may consist of text, images, audio, video, etc.
9. Text mining has been applied widely to web content mining. Text mining extracts
information from unstructured text and draws heavily on techniques from such disciplines
as information retrieval (IR) and natural language processing (NLP).
10. Sentiment analysis uses text analysis, natural language processing, and computational
linguistics to identify and extract user sentiments or opinions from text materials.
11. The IoT refers to a technology environment in which devices and sensors have unique
identifiers with the ability to share data and collaborate over the internet even without any
human intervention.
12. We closed the discussion with an open question: what types of data exist around us that
we are still not able to record? One such type of data could be the thoughts in the human mind.
Can we record our thoughts and create a business model around that? Think it over!
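
To make the three types of web mining from points 5 to 8 more concrete, here is a minimal Python sketch. It is only an illustration, not a method taken from the article: the click log, the hyperlinks, the page text, and all function names are hypothetical and were invented for this example. Each function shows the simplest possible version of one type of web mining.

```python
from collections import Counter, defaultdict

# Hypothetical toy data, invented purely for illustration.
CLICK_LOG = [  # (user, page visited)
    ("u1", "/home"), ("u1", "/products"), ("u2", "/home"),
    ("u1", "/products"), ("u2", "/contact"),
]
HYPERLINKS = [  # (source page, target page)
    ("/home", "/products"), ("/home", "/contact"),
    ("/products", "/home"), ("/contact", "/home"),
]
PAGE_TEXT = {  # page -> text content of that page
    "/home": "big data big impact",
    "/products": "data tools for big data",
}


def usage_patterns(click_log):
    """Web usage mining: count how often each user visits each page."""
    visits = defaultdict(Counter)
    for user, page in click_log:
        visits[user][page] += 1
    return visits


def in_degree(hyperlinks):
    """Web structure mining: treat pages as nodes and hyperlinks as
    directed edges, then count the incoming links of each page."""
    return Counter(target for _source, target in hyperlinks)


def term_frequency(page_text):
    """Web content mining: count how often each word occurs in the
    text of all pages."""
    counts = Counter()
    for text in page_text.values():
        counts.update(text.lower().split())
    return counts


if __name__ == "__main__":
    print(dict(usage_patterns(CLICK_LOG)["u1"]))  # page visits of user u1
    print(dict(in_degree(HYPERLINKS)))            # incoming links per page
    print(dict(term_frequency(PAGE_TEXT)))        # word counts over all pages
```

Real web mining systems work on far larger and messier data, but the idea is the same: usage mining looks at what users do, structure mining looks at how pages are linked, and content mining looks at what the pages contain.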
I hope to have similar discussions in the coming sessions as well. Happy learning!
Best Regards,
Prof. Manas Tripathi