SlideShare a Scribd company logo
DATA VISUALIZATION
INSIGHT BEHIND DATA
Mukul Taneja
Data Specialist,
Gramener
WHAT IS BIG DATA ?
 Volume: Organizations collect data from a variety of sources,
including business transactions, social media and information from
sensor or machine-to-machine data. In the past, storing it would’ve
been a problem – but new technologies (such as Hadoop) have
eased the burden.
 Velocity: Data streams in at an unprecedented speed and must be
dealt with in a timely manner. RFID tags, sensors and smart
metering are driving the need to deal with torrents of data in near-
real time.
 Variety: Data comes in all types of formats – from structured,
numeric data in traditional databases to unstructured text
documents, email, video, audio, stock ticker data and financial
transactions.
BIG DATA also is…
 Extremely large data sets that may be analysed computationally to
reveal patterns, trends, and associations, especially relating to
human behaviour and interactions
 Which data is “BIG”?
• Google Services
• Social Media
• E-Commerce
• Geo-location and many more…
WHY CARE FOR DATA ?
DATA IS EVERYWHERE.
As users are continuously increasing, amount of data is getting
massive.
HOW TO USE DATA ?
To predict the user behavior…
HOW TO USE DATA ?
 To make connectivity better…
HOW TO USE DATA ?
 To provide better products to buy….
And the whole idea is ….
To give user…
 Better experience
 Accuracy
 Relevancy
 Productivity
 Opportunity
 Reliability
WHAT TO DO WITH DATA ?
IS
EVERYWHERE
ANALYSIS
DATA
VISUALIZE
WHY SHOULD WE VISUALIZE THE
DATA?
We humans do not understand the
language of raw DATA
“ We have internal information. Getting
information from outside is our
challenge. There’s no way of doing
that.
– Senior Editor
Leading Media Company
SHOW
me what is happening
with the data
EXPLAIN
to me why it’s
happening
Allow me to
EXPLORE
and figure it out
Just
EXPOSE
the data to me
Low effort High effort
High effort
Low effort
Creator
Consumer
There are many ways to aid data
consumption
SHOW
me what is happening
with the data
EXPLAIN
to me why it’s
happening
Allow me to
EXPLORE
and figure it out
Just
EXPOSE
the data to me
RELIGIONS IN INDIA
RELIGIONS IN AUSTRALIA
LET’S TAKE TESCO’S GROCERIEScategory title kJ rate
dairy Activia Pouring Natural Yogurt 1X950g 216 0.21
dairy Activia Pouring Strawberry Yogurt 1X950g 250 0.21
dairy Activia Pouring Vanilla Yogurt 1X950g 263 0.21
icecream Almondy Daim 400G 1804 0.75
icecream Almondy Toblerone 400G 1850 0.5
cereals Alpen 10 Pack Lite Summer Fruits Cereal Bars 210G 1222 1.57
cereals Alpen 10Pk Fruit Nut And Chocolate Cereal Bars 290G 1812 1.14
cereals Alpen Coconut And Chocolate Cereal Bars 5Pk 145G 1863 1.24
cereals Alpen Fruit And Nut With Chocolate Cereal Bar 5X29g 1812 1.24
cereals Alpen High Fruit 650G 1439 0.4
cereals Alpen Light Bars Chocolate And Orange 5X21g 1246 1.71
cereals Alpen Light Chocolate And Fudge Bar 5X21g 1264 1.71
cereals Alpen Light Sultana & Apple Bars 5Pk 105G 1197 1.71
cereals Alpen Light Summer Fruits Bars 5Pk 105G 1222 1.71
cereals Alpen No Added Sugar 1.3Kg 1488 0.31
cereals Alpen No Added Sugar 560G 1488 0.46
cereals Alpen Original 1.5Kg 1509 0.27
cereals Alpen Original Muesli 750G 1509 0.35
cereals Alpen Raspberry And Yoghurt Cereal Bars5x29g 1748 1.24
cereals Alpen Strawberry With Yoghurt Cereal Bar 5X29g 1756 1.24
dairy Alpro Natural Yofu 500G 0.28
dairy Alpro Raspberry Vanilla Yofu 4X125g 0.35
dairy Alpro Strawberry And Fof Soya Yofu 4X125g 0.35
The Shawshank
Redepmption
The Godfather
The Dark Knight
Titanic
The Phantom Menace
Twilight
New Moon
Wild Wild West
Transformers
The Good, The Bad,
The Ugly
12 Angry Men
7 Samurai
Taare Zameen Par
Rang De Basanti
Yojinbo
MORE VOTES
BETTER RATED
Many unwatched movies
Few unwatched movies
Mix of watched & unwatched
Few watched movies
Many watched movies
MOVIES ON THE IMDB
3 Idiots
SHOW
me what is happening
with the data
EXPLAIN
to me why it’s
happening
Allow me to
EXPLORE
and figure it out
Just
EXPOSE
the data to me
Simplifying access to data is a big win
SHOW
me what is happening
with the data
EXPLAIN
to me why it’s
happening
Allow me to
EXPLORE
and figure it out
Just
EXPOSE
the data to me
EDUCATION
PREDICTING MARKS
What determines a child’s
marks?
Do girls score better than boys?
Does the choice of subject
matter?
Does the medium of instruction
matter?
Does community or religion
matter?
Does their birthday matter?
Does the first letter of their
name matter?
0
5,000
10,000
15,000
20,000
25,000
30,000
35,000
40,000
0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100
TN CLASS X: ENGLISH
TN CLASS X: SOCIAL SCIENCE
0
5,000
10,000
15,000
20,000
25,000
30,000
35,000
40,000
0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100
TN CLASS X: MATHEMATICS
0
5,000
10,000
15,000
20,000
25,000
30,000
35,000
40,000
0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100
SHOW
me what is happening
with the data
EXPLAIN
to me why it’s
happening
Allow me to
EXPLORE
and figure it out
Just
EXPOSE
the data to me
… to make the hidden obvious
SHOW
me what is happening
with the data
EXPLAIN
to me why it’s
happening
Allow me to
EXPLORE
and figure it out
Just
EXPOSE
the data to me
Let’s look at 15 years of US Birth Data
This is a dataset (1975 – 1990) that has
been around for several years, and has
been studied extensively. Yet, a
visualization can reveal patterns that
are neither obvious nor well known.
For example,
• Are birthdays uniformly distributed?
• Do doctors or parents exercise the C-section option to move dates?
• Is there any day of the month that has unusually high or low births?
• Are there any months with relatively high or low births?
Very high births in September.
But this is fairly well known. Most
conceptions happen during the
winter holiday season
Relatively few births during the
Christmas and Thanksgiving
holidays, as well as New Year and
Independence Day.
Most people prefer not
to have children on the
13th of any month, given
that it’s an unlucky day
Some special days like April
Fool’s day are avoided, but
Valentine’s Day is quite
popular
More births Fewer births … on average, for each day of the year (from 1975 to 1990)
The pattern in India is quite different
This is a birth date dataset that’s
obtained from school admission data
for over 10 million children. When we
compare this with births in the US, we
see none of the same patterns.
For example,
• Is there an aversion to the 13th or is there a local cultural nuance?
• Are holidays avoided for births?
• Which months have a higher propensity for births, and why?
• Are there any patterns not found in the US data?
Very few children are born in the
month of August, and thereafter.
Most births are concentrated in
the first half of the year
We see a large number of
children born on the 5th, 10th,
15th, 20th and 25th of each month
– that is, round numbered dates
Such round numbered patterns a
typical indication of fraud. Here,
birthdates are brought forward to
aid early school admission
More births Fewer births … on average, for each day of the year (from 2007 to 2013)
This adversely impacts children’s marks
It’s a well established fact that older
children tend to do better at school in
most activities. Since many children
have had their birth dates brought
forward, these younger children suffer.
The average marks of children “born” on the 1st, 5th, 10th, 15th etc. of
the month tend to score lower marks.
• Are holidays avoided for births?
• Which months have a higher propensity for births, and why?
• Are there any patterns not found in the US data?
Higher marks Lower marks … on average, for children born on a given day of the year (from 2007 to 2013)
Children “born” on round numbered days score lower marks on average,
due to a higher proportion of younger children
SHOW
me what is happening
with the data
EXPLAIN
to me why it’s
happening
Allow me to
EXPLORE
and figure it out
Just
EXPOSE
the data to me
SUICIDES IN INDIA
UTTERLY, BUTTERLY, COLORFUL
HOW THE WORLD SEARCHED FOR
TERROR ATTACKS ?
DATAMEET GOOGLE GROUP
BIG DATA HANDY TOOLS
Programming Languages
Python
R
Octave
BIG DATA HANDY TOOLS
Databases
Hadoop
Mondo DB
TeraData
BIG DATA HANDY TOOLS
Front End Javascript Libraries
D3.JS
Dimple Js
QUESTIONS?
THANK YOU

More Related Content

Viewers also liked

NoSQL – Beyond the Key-Value Store
NoSQL – Beyond the Key-Value StoreNoSQL – Beyond the Key-Value Store
NoSQL – Beyond the Key-Value Store
DATAVERSITY
 
Coding with Riak (from Velocity 2015)
Coding with Riak (from Velocity 2015)Coding with Riak (from Velocity 2015)
Coding with Riak (from Velocity 2015)
Basho Technologies
 
Schema Design for Riak
Schema Design for RiakSchema Design for Riak
Schema Design for Riak
Sean Cribbs
 
Relational Databases to Riak
Relational Databases to RiakRelational Databases to Riak
Relational Databases to Riak
Basho Technologies
 
Riak Training Session — Surge 2011
Riak Training Session — Surge 2011Riak Training Session — Surge 2011
Riak Training Session — Surge 2011
DstroyAllModels
 
Introduction to Riak - Red Dirt Ruby Conf Training
Introduction to Riak - Red Dirt Ruby Conf TrainingIntroduction to Riak - Red Dirt Ruby Conf Training
Introduction to Riak - Red Dirt Ruby Conf Training
Sean Cribbs
 
Key-Value Stores: a practical overview
Key-Value Stores: a practical overviewKey-Value Stores: a practical overview
Key-Value Stores: a practical overviewMarc Seeger
 
Microsoft Modern Analytics
Microsoft Modern AnalyticsMicrosoft Modern Analytics
Microsoft Modern Analytics
MSDEVMTL
 
Modern Data Warehousing with the Microsoft Analytics Platform System
Modern Data Warehousing with the Microsoft Analytics Platform SystemModern Data Warehousing with the Microsoft Analytics Platform System
Modern Data Warehousing with the Microsoft Analytics Platform System
James Serra
 
Tableau Software - Business Analytics and Data Visualization
Tableau Software - Business Analytics and Data VisualizationTableau Software - Business Analytics and Data Visualization
Tableau Software - Business Analytics and Data Visualization
lesterathayde
 
Tableau presentation
Tableau presentationTableau presentation
Tableau presentationkt166212
 

Viewers also liked (11)

NoSQL – Beyond the Key-Value Store
NoSQL – Beyond the Key-Value StoreNoSQL – Beyond the Key-Value Store
NoSQL – Beyond the Key-Value Store
 
Coding with Riak (from Velocity 2015)
Coding with Riak (from Velocity 2015)Coding with Riak (from Velocity 2015)
Coding with Riak (from Velocity 2015)
 
Schema Design for Riak
Schema Design for RiakSchema Design for Riak
Schema Design for Riak
 
Relational Databases to Riak
Relational Databases to RiakRelational Databases to Riak
Relational Databases to Riak
 
Riak Training Session — Surge 2011
Riak Training Session — Surge 2011Riak Training Session — Surge 2011
Riak Training Session — Surge 2011
 
Introduction to Riak - Red Dirt Ruby Conf Training
Introduction to Riak - Red Dirt Ruby Conf TrainingIntroduction to Riak - Red Dirt Ruby Conf Training
Introduction to Riak - Red Dirt Ruby Conf Training
 
Key-Value Stores: a practical overview
Key-Value Stores: a practical overviewKey-Value Stores: a practical overview
Key-Value Stores: a practical overview
 
Microsoft Modern Analytics
Microsoft Modern AnalyticsMicrosoft Modern Analytics
Microsoft Modern Analytics
 
Modern Data Warehousing with the Microsoft Analytics Platform System
Modern Data Warehousing with the Microsoft Analytics Platform SystemModern Data Warehousing with the Microsoft Analytics Platform System
Modern Data Warehousing with the Microsoft Analytics Platform System
 
Tableau Software - Business Analytics and Data Visualization
Tableau Software - Business Analytics and Data VisualizationTableau Software - Business Analytics and Data Visualization
Tableau Software - Business Analytics and Data Visualization
 
Tableau presentation
Tableau presentationTableau presentation
Tableau presentation
 

Similar to Data visualization

Data visualization for social problems
Data visualization for social problemsData visualization for social problems
Data visualization for social problems
Gramener
 
Data monetization
Data monetizationData monetization
Data monetization
Gramener
 
New Age Tools in Data Journalism - Analytics & Visualization
New Age Tools in Data Journalism - Analytics & VisualizationNew Age Tools in Data Journalism - Analytics & Visualization
New Age Tools in Data Journalism - Analytics & Visualization
Ganes Kesari
 
Exploratory data analysis
Exploratory data analysisExploratory data analysis
Exploratory data analysis
Gramener
 
The ultimate guide to data storytelling | Materclass
The ultimate guide to data storytelling | MaterclassThe ultimate guide to data storytelling | Materclass
The ultimate guide to data storytelling | Materclass
Gramener
 
Telling Stories with Data
Telling Stories with DataTelling Stories with Data
Telling Stories with Data
The Communications Network
 
Children with Teen Parents
Children with Teen ParentsChildren with Teen Parents
Children with Teen ParentsAlyssa Bricker
 
Humanizing Data Storytelling for Greater Business Impact
Humanizing Data Storytelling for Greater Business ImpactHumanizing Data Storytelling for Greater Business Impact
Humanizing Data Storytelling for Greater Business Impact
Gramener
 
Parenting In A Media Driven World Slideshare
Parenting In A Media Driven World SlideshareParenting In A Media Driven World Slideshare
Parenting In A Media Driven World Slideshare
chrisweber
 
Prisoners of birth: TEDx Whitefield
Prisoners of birth: TEDx WhitefieldPrisoners of birth: TEDx Whitefield
Prisoners of birth: TEDx Whitefield
Gramener
 
Storytelling for analytics | Naveen Gattu | CDAO Apex 2020
Storytelling for analytics | Naveen Gattu | CDAO Apex 2020Storytelling for analytics | Naveen Gattu | CDAO Apex 2020
Storytelling for analytics | Naveen Gattu | CDAO Apex 2020
Gramener
 
Data Storytelling - Game changer for Analytics
Data Storytelling - Game changer for Analytics Data Storytelling - Game changer for Analytics
Data Storytelling - Game changer for Analytics
Gramener
 
The value of storytelling through data
The value of storytelling through dataThe value of storytelling through data
The value of storytelling through data
Gramener
 
Legacy Management Systems Slide Show
Legacy Management Systems Slide ShowLegacy Management Systems Slide Show
Legacy Management Systems Slide Show
guestd84642
 
Dreams into nightmares
Dreams into nightmaresDreams into nightmares
Dreams into nightmares
Andrew Campbell
 
Final Nacctep Mooks[1]
Final Nacctep Mooks[1]Final Nacctep Mooks[1]
Final Nacctep Mooks[1]
Regis University
 
Legacy Management Systems Slide Show
Legacy Management Systems Slide ShowLegacy Management Systems Slide Show
Legacy Management Systems Slide Show
guestd334a0a
 
ATTITUDE CHANGE.ppt
ATTITUDE CHANGE.pptATTITUDE CHANGE.ppt
ATTITUDE CHANGE.ppt
ziena2
 
Power School Presentation Shared
Power School Presentation SharedPower School Presentation Shared
Power School Presentation Shared
BrianWoodland
 
Marketing to Seniors
Marketing to SeniorsMarketing to Seniors
Marketing to Seniors
PCG Marketing, LLC
 

Similar to Data visualization (20)

Data visualization for social problems
Data visualization for social problemsData visualization for social problems
Data visualization for social problems
 
Data monetization
Data monetizationData monetization
Data monetization
 
New Age Tools in Data Journalism - Analytics & Visualization
New Age Tools in Data Journalism - Analytics & VisualizationNew Age Tools in Data Journalism - Analytics & Visualization
New Age Tools in Data Journalism - Analytics & Visualization
 
Exploratory data analysis
Exploratory data analysisExploratory data analysis
Exploratory data analysis
 
The ultimate guide to data storytelling | Materclass
The ultimate guide to data storytelling | MaterclassThe ultimate guide to data storytelling | Materclass
The ultimate guide to data storytelling | Materclass
 
Telling Stories with Data
Telling Stories with DataTelling Stories with Data
Telling Stories with Data
 
Children with Teen Parents
Children with Teen ParentsChildren with Teen Parents
Children with Teen Parents
 
Humanizing Data Storytelling for Greater Business Impact
Humanizing Data Storytelling for Greater Business ImpactHumanizing Data Storytelling for Greater Business Impact
Humanizing Data Storytelling for Greater Business Impact
 
Parenting In A Media Driven World Slideshare
Parenting In A Media Driven World SlideshareParenting In A Media Driven World Slideshare
Parenting In A Media Driven World Slideshare
 
Prisoners of birth: TEDx Whitefield
Prisoners of birth: TEDx WhitefieldPrisoners of birth: TEDx Whitefield
Prisoners of birth: TEDx Whitefield
 
Storytelling for analytics | Naveen Gattu | CDAO Apex 2020
Storytelling for analytics | Naveen Gattu | CDAO Apex 2020Storytelling for analytics | Naveen Gattu | CDAO Apex 2020
Storytelling for analytics | Naveen Gattu | CDAO Apex 2020
 
Data Storytelling - Game changer for Analytics
Data Storytelling - Game changer for Analytics Data Storytelling - Game changer for Analytics
Data Storytelling - Game changer for Analytics
 
The value of storytelling through data
The value of storytelling through dataThe value of storytelling through data
The value of storytelling through data
 
Legacy Management Systems Slide Show
Legacy Management Systems Slide ShowLegacy Management Systems Slide Show
Legacy Management Systems Slide Show
 
Dreams into nightmares
Dreams into nightmaresDreams into nightmares
Dreams into nightmares
 
Final Nacctep Mooks[1]
Final Nacctep Mooks[1]Final Nacctep Mooks[1]
Final Nacctep Mooks[1]
 
Legacy Management Systems Slide Show
Legacy Management Systems Slide ShowLegacy Management Systems Slide Show
Legacy Management Systems Slide Show
 
ATTITUDE CHANGE.ppt
ATTITUDE CHANGE.pptATTITUDE CHANGE.ppt
ATTITUDE CHANGE.ppt
 
Power School Presentation Shared
Power School Presentation SharedPower School Presentation Shared
Power School Presentation Shared
 
Marketing to Seniors
Marketing to SeniorsMarketing to Seniors
Marketing to Seniors
 

Recently uploaded

【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
NABLAS株式会社
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
AbhimanyuSinha9
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
axoqas
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
ewymefz
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
haila53
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
ahzuo
 
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Subhajit Sahu
 
Empowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxEmpowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptx
benishzehra469
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
ewymefz
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
Oppotus
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
u86oixdj
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
ArpitMalhotra16
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
Subhajit Sahu
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
NABLAS株式会社
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
jerlynmaetalle
 
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
Tiktokethiodaily
 
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
vcaxypu
 
Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
TravisMalana
 

Recently uploaded (20)

【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
 
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
 
Empowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxEmpowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptx
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
 
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
 
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
 
Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
 

Data visualization

  • 1. DATA VISUALIZATION INSIGHT BEHIND DATA Mukul Taneja Data Specialist, Gramener
  • 2. WHAT IS BIG DATA ?  Volume: Organizations collect data from a variety of sources, including business transactions, social media and information from sensor or machine-to-machine data. In the past, storing it would’ve been a problem – but new technologies (such as Hadoop) have eased the burden.  Velocity: Data streams in at an unprecedented speed and must be dealt with in a timely manner. RFID tags, sensors and smart metering are driving the need to deal with torrents of data in near- real time.  Variety: Data comes in all types of formats – from structured, numeric data in traditional databases to unstructured text documents, email, video, audio, stock ticker data and financial transactions.
  • 3. BIG DATA also is…  Extremely large data sets that may be analysed computationally to reveal patterns, trends, and associations, especially relating to human behaviour and interactions  Which data is “BIG”? • Google Services • Social Media • E-Commerce • Geo-location and many more…
  • 4. WHY CARE FOR DATA ? DATA IS EVERYWHERE. As users are continuously increasing, amount of data is getting massive.
  • 5. HOW TO USE DATA ? To predict the user behavior…
  • 6. HOW TO USE DATA ?  To make connectivity better…
  • 7. HOW TO USE DATA ?  To provide better products to buy….
  • 8. And the whole idea is …. To give user…  Better experience  Accuracy  Relevancy  Productivity  Opportunity  Reliability
  • 9. WHAT TO DO WITH DATA ? IS EVERYWHERE ANALYSIS DATA VISUALIZE
  • 10. WHY SHOULD WE VISUALIZE THE DATA? We humans do not understand the language of raw DATA
  • 11. “ We have internal information. Getting information from outside is our challenge. There’s no way of doing that. – Senior Editor Leading Media Company
  • 12. SHOW me what is happening with the data EXPLAIN to me why it’s happening Allow me to EXPLORE and figure it out Just EXPOSE the data to me Low effort High effort High effort Low effort Creator Consumer There are many ways to aid data consumption
  • 13. SHOW me what is happening with the data EXPLAIN to me why it’s happening Allow me to EXPLORE and figure it out Just EXPOSE the data to me
  • 16.
  • 17. LET’S TAKE TESCO’S GROCERIEScategory title kJ rate dairy Activia Pouring Natural Yogurt 1X950g 216 0.21 dairy Activia Pouring Strawberry Yogurt 1X950g 250 0.21 dairy Activia Pouring Vanilla Yogurt 1X950g 263 0.21 icecream Almondy Daim 400G 1804 0.75 icecream Almondy Toblerone 400G 1850 0.5 cereals Alpen 10 Pack Lite Summer Fruits Cereal Bars 210G 1222 1.57 cereals Alpen 10Pk Fruit Nut And Chocolate Cereal Bars 290G 1812 1.14 cereals Alpen Coconut And Chocolate Cereal Bars 5Pk 145G 1863 1.24 cereals Alpen Fruit And Nut With Chocolate Cereal Bar 5X29g 1812 1.24 cereals Alpen High Fruit 650G 1439 0.4 cereals Alpen Light Bars Chocolate And Orange 5X21g 1246 1.71 cereals Alpen Light Chocolate And Fudge Bar 5X21g 1264 1.71 cereals Alpen Light Sultana & Apple Bars 5Pk 105G 1197 1.71 cereals Alpen Light Summer Fruits Bars 5Pk 105G 1222 1.71 cereals Alpen No Added Sugar 1.3Kg 1488 0.31 cereals Alpen No Added Sugar 560G 1488 0.46 cereals Alpen Original 1.5Kg 1509 0.27 cereals Alpen Original Muesli 750G 1509 0.35 cereals Alpen Raspberry And Yoghurt Cereal Bars5x29g 1748 1.24 cereals Alpen Strawberry With Yoghurt Cereal Bar 5X29g 1756 1.24 dairy Alpro Natural Yofu 500G 0.28 dairy Alpro Raspberry Vanilla Yofu 4X125g 0.35 dairy Alpro Strawberry And Fof Soya Yofu 4X125g 0.35
  • 18.
  • 19.
  • 20. The Shawshank Redepmption The Godfather The Dark Knight Titanic The Phantom Menace Twilight New Moon Wild Wild West Transformers The Good, The Bad, The Ugly 12 Angry Men 7 Samurai Taare Zameen Par Rang De Basanti Yojinbo MORE VOTES BETTER RATED Many unwatched movies Few unwatched movies Mix of watched & unwatched Few watched movies Many watched movies MOVIES ON THE IMDB 3 Idiots
  • 21. SHOW me what is happening with the data EXPLAIN to me why it’s happening Allow me to EXPLORE and figure it out Just EXPOSE the data to me Simplifying access to data is a big win
  • 22. SHOW me what is happening with the data EXPLAIN to me why it’s happening Allow me to EXPLORE and figure it out Just EXPOSE the data to me
  • 23. EDUCATION PREDICTING MARKS What determines a child’s marks? Do girls score better than boys? Does the choice of subject matter? Does the medium of instruction matter? Does community or religion matter? Does their birthday matter? Does the first letter of their name matter?
  • 24. 0 5,000 10,000 15,000 20,000 25,000 30,000 35,000 40,000 0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100 TN CLASS X: ENGLISH
  • 25. TN CLASS X: SOCIAL SCIENCE 0 5,000 10,000 15,000 20,000 25,000 30,000 35,000 40,000 0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100
  • 26. TN CLASS X: MATHEMATICS 0 5,000 10,000 15,000 20,000 25,000 30,000 35,000 40,000 0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100
  • 27. SHOW me what is happening with the data EXPLAIN to me why it’s happening Allow me to EXPLORE and figure it out Just EXPOSE the data to me … to make the hidden obvious
  • 28. SHOW me what is happening with the data EXPLAIN to me why it’s happening Allow me to EXPLORE and figure it out Just EXPOSE the data to me
  • 29. Let’s look at 15 years of US Birth Data This is a dataset (1975 – 1990) that has been around for several years, and has been studied extensively. Yet, a visualization can reveal patterns that are neither obvious nor well known. For example, • Are birthdays uniformly distributed? • Do doctors or parents exercise the C-section option to move dates? • Is there any day of the month that has unusually high or low births? • Are there any months with relatively high or low births? Very high births in September. But this is fairly well known. Most conceptions happen during the winter holiday season Relatively few births during the Christmas and Thanksgiving holidays, as well as New Year and Independence Day. Most people prefer not to have children on the 13th of any month, given that it’s an unlucky day Some special days like April Fool’s day are avoided, but Valentine’s Day is quite popular More births Fewer births … on average, for each day of the year (from 1975 to 1990)
  • 30. The pattern in India is quite different This is a birth date dataset that’s obtained from school admission data for over 10 million children. When we compare this with births in the US, we see none of the same patterns. For example, • Is there an aversion to the 13th or is there a local cultural nuance? • Are holidays avoided for births? • Which months have a higher propensity for births, and why? • Are there any patterns not found in the US data? Very few children are born in the month of August, and thereafter. Most births are concentrated in the first half of the year We see a large number of children born on the 5th, 10th, 15th, 20th and 25th of each month – that is, round numbered dates Such round numbered patterns a typical indication of fraud. Here, birthdates are brought forward to aid early school admission More births Fewer births … on average, for each day of the year (from 2007 to 2013)
  • 31. This adversely impacts children’s marks It’s a well established fact that older children tend to do better at school in most activities. Since many children have had their birth dates brought forward, these younger children suffer. The average marks of children “born” on the 1st, 5th, 10th, 15th etc. of the month tend to score lower marks. • Are holidays avoided for births? • Which months have a higher propensity for births, and why? • Are there any patterns not found in the US data? Higher marks Lower marks … on average, for children born on a given day of the year (from 2007 to 2013) Children “born” on round numbered days score lower marks on average, due to a higher proportion of younger children
  • 32. SHOW me what is happening with the data EXPLAIN to me why it’s happening Allow me to EXPLORE and figure it out Just EXPOSE the data to me
  • 35. HOW THE WORLD SEARCHED FOR TERROR ATTACKS ?
  • 37. BIG DATA HANDY TOOLS Programming Languages Python R Octave
  • 38. BIG DATA HANDY TOOLS Databases Hadoop Mondo DB TeraData
  • 39. BIG DATA HANDY TOOLS Front End Javascript Libraries D3.JS Dimple Js