2. Course Outcomes
On successful completion of the course, the students will be able to:
CO1: Recall the basic concepts of big data and data science. (PO1, PO2, PO12)
CO2: Utilize statistical concepts of big data collection, data analysis, modelling, and inference. (PO1, PO2, PO4, PO5, PO12)
CO3: Identify appropriate data mining algorithms to solve real-world problems. (PO1, PO2, PO3, PO4, PO5, PO12)
CO4: Analyze data, relevant models and tools for respective applications. (PO1, PO2, PO3, PO4, PO5, PO12)
8. Data Information
Data vs. Information Examples Chart
Data: each individual homework and test grade of a student in one class
Information: the student’s average grade for each class
Data: typing the words “cat videos” in your computer search engine (input)
Information: the list of search results that includes a variety of cat videos on the internet (output)
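The homework-grade example above can be sketched in a few lines of Python. This is only an illustration: the class names and grade values are hypothetical, and `average_by_class` is a helper name invented for this sketch.

```python
from statistics import mean

# Data: each individual homework and test grade of one student (hypothetical values).
grades = [
    ("Math", 90),
    ("Math", 80),
    ("History", 70),
    ("History", 75),
    ("History", 95),
]


def average_by_class(records):
    """Group raw grades by class and return the average grade for each class."""
    by_class = {}
    for class_name, grade in records:
        by_class.setdefault(class_name, []).append(grade)
    return {class_name: mean(values) for class_name, values in by_class.items()}


# Information: the student's average grade for each class.
averages = average_by_class(grades)
print(averages)
```

The raw list is the data; the per-class averages derived from it are the information.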
9. Knowledge
Information can be converted into knowledge about historical patterns and future
trends.
Both knowledge and information have data as an essential component.
◦ Both knowledge and information can be identified by observation, stored,
retrieved and processed further.
◦ Information can be processed into knowledge; knowledge can be communicated
as information.
11. Database
A database is a collection of information that is organized so that it can be easily
accessed, managed and updated.
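As a minimal sketch of "accessed, managed and updated", the snippet below uses Python's built-in `sqlite3` module with an in-memory database. The table name, column names, and rows are hypothetical and chosen only for illustration.

```python
import sqlite3

# An in-memory database; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, grade REAL)")

# Manage: insert rows into the organized collection.
conn.executemany(
    "INSERT INTO students (name, grade) VALUES (?, ?)",
    [("Asha", 82.5), ("Ravi", 74.0)],
)

# Update: change one stored value.
conn.execute("UPDATE students SET grade = ? WHERE name = ?", (79.0, "Ravi"))
conn.commit()

# Access: query the stored data.
rows = conn.execute("SELECT name, grade FROM students ORDER BY name").fetchall()
print(rows)
conn.close()
```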
13. Introduction to Big Data
Data that is very large in size is called Big Data.
Normally we work with data of size MB (Word documents, Excel sheets) or at most GB
(movies, code), but data on the scale of petabytes, i.e. 10^15 bytes, is called Big
Data. It is stated that almost 90% of today's data has been generated in the past 3 years.
14. Units of data
The bit
The Byte
Kilobyte (1024 Bytes)
Megabyte (1024 Kilobytes)
Gigabyte (1,024 Megabytes, or 1,048,576 Kilobytes)
Terabyte (1,024 Gigabytes)
Petabyte (1,024 Terabytes, or 1,048,576 Gigabytes)
Exabyte (1,024 Petabytes)
Zettabyte (1,024 Exabytes)
Yottabyte (1,024 Zettabytes, or 1,208,925,819,614,629,174,706,176 bytes)
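The unit ladder above, where each step is a factor of 1,024, can be expressed as a small conversion routine. This is a sketch only; `human_readable` and `UNITS` are names invented for this example.

```python
# Unit names in ascending order; each unit is 1,024 times the previous one.
UNITS = [
    "Bytes", "Kilobytes", "Megabytes", "Gigabytes", "Terabytes",
    "Petabytes", "Exabytes", "Zettabytes", "Yottabytes",
]


def human_readable(num_bytes):
    """Express a byte count in the largest unit that keeps the value below 1,024."""
    value = float(num_bytes)
    for unit in UNITS:
        # Stop at the first unit where the value fits, or at the largest unit.
        if value < 1024 or unit == UNITS[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024


print(human_readable(1024 ** 5))  # one petabyte expressed in bytes
print(human_readable(1_048_576))  # 1,048,576 bytes
```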
15. Sources of Big Data
Social networking sites: Facebook, Google, and LinkedIn all generate huge amounts of
data on a day-to-day basis, as they have billions of users worldwide.
E-commerce sites: Sites like Amazon, Flipkart, and Alibaba generate huge amounts of logs from which
users' buying trends can be traced.
Weather stations: Weather stations and satellites produce very large volumes of data, which are stored and
processed to forecast the weather.
Telecom companies: Telecom giants like Airtel and Vodafone study user trends and publish their plans
accordingly; for this they store the data of their millions of users.
Share market: Stock exchanges across the world generate huge amounts of data through their daily
transactions.
16. Types Of Big Data
Structured
Unstructured
Semi-structured
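To make the three types concrete, here is a small sketch in Python. The sample records are hypothetical. The structured sample is CSV with fixed columns, the semi-structured sample is JSON with self-describing keys, and the unstructured sample is free text.

```python
import csv
import io
import json

# Structured: fixed columns, as in a relational table or a CSV file.
structured = io.StringIO("order_id,amount\n1001,250.0\n1002,99.5\n")
orders = list(csv.DictReader(structured))

# Semi-structured: self-describing keys, but fields may vary from record to record.
semi_structured = '{"user": "u42", "tags": ["sale", "mobile"], "device": {"os": "Android"}}'
event = json.loads(semi_structured)

# Unstructured: free text with no predefined schema.
unstructured = "Loved the phone, but delivery took five days."
word_count = len(unstructured.split())

print(orders[0]["amount"], event["device"]["os"], word_count)
```

Structured data can be queried by column name directly, semi-structured data needs its nesting navigated, and unstructured data needs further processing (here, only a naive word count) before it can be analyzed.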
21. Definition – Data Warehouse
Data warehousing (DW) is the process of collecting and managing data from
varied sources to provide meaningful business insights.
25. Contd…
Data mining (knowledge discovery from data)
Extraction of interesting (non-trivial, implicit, previously unknown and
potentially useful) patterns or knowledge from huge amounts of data
Alternative names
Knowledge discovery (mining) in databases (KDD), knowledge
extraction, data/pattern analysis, data archeology, data dredging, information
harvesting, business intelligence, etc.
28. Data Analytics
Data analytics (DA) is the process of examining data sets in order
to find trends and draw conclusions about the information they
contain.
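A very small illustration of "finding a trend and drawing a conclusion" is sketched below. The monthly sales figures are hypothetical, and averaging the month-over-month changes is just one simple way to summarize a trend, not a prescribed method.

```python
# Hypothetical monthly sales figures (the data set being examined).
monthly_sales = [120, 135, 150, 160, 185]

# Month-over-month changes expose the direction of the trend.
changes = [later - earlier for earlier, later in zip(monthly_sales, monthly_sales[1:])]
average_change = sum(changes) / len(changes)

# Draw a simple conclusion from the average change.
if average_change > 0:
    trend = "increasing"
elif average_change < 0:
    trend = "decreasing"
else:
    trend = "flat"

print(changes, average_change, trend)
```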