AARFA KHAN
2019-12-21
Big Data: Introduction and Case Studies
CONTENTS
• Big Data concept
• The difference between big data and traditional data
• Typical features of big data (4V)
• Why Big Data
• Benefits of Big Data
What is “BIG DATA”?
What is it that we are really talking about?
• “Big Data is the frontier of a firm's ability to store, process, and access (SPA) all the data it needs to operate effectively, make decisions, reduce risks, and serve customers.”
-- Forrester
• “Big data is data that exceeds the processing capacity of conventional database systems. The data is too big, moves too fast, or doesn't fit the strictures of your database architectures. To gain value from this data, you must choose an alternative way to process it.”
-- O’Reilly
• “Big data is the data characterized by 3 attributes: volume, variety and velocity.”
-- IBM
• “Big data is the data characterized by 4 key attributes: volume, variety, velocity and value.”
-- Oracle
• Let’s look at Big Data in a different
way.
• Let’s try again…
Different Units of Data Size
1 Byte = 8 Bits
1 KB = 1,024 Bytes = 8,192 Bits
1 MB = 1,024 KB = 1,048,576 Bytes
1 GB = 1,024 MB = 1,048,576 KB
1 TB = 1,024 GB = 1,048,576 MB
1 PB = 1,024 TB = 1,048,576 GB
1 EB = 1,024 PB = 1,048,576 TB
1 ZB = 1,024 EB = 1,048,576 PB
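The unit ladder above can be sketched in a few lines of Python. This is only an illustration of the 1,024-based conversions in the table; the names `UNITS`, `to_bytes`, and `humanize` are made up for this example and do not come from the slides.

```python
# Binary unit ladder from the table above: each step is a factor of 1,024.
# UNITS, to_bytes and humanize are illustrative names, not from the slides.
UNITS = ["Bytes", "KB", "MB", "GB", "TB", "PB", "EB", "ZB"]


def to_bytes(value, unit):
    """Convert a quantity in the given unit to bytes (1,024-based)."""
    return value * 1024 ** UNITS.index(unit)


def humanize(num_bytes):
    """Express a byte count in the largest unit that keeps the value >= 1."""
    value = float(num_bytes)
    for unit in UNITS:
        if value < 1024 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1024


print(to_bytes(1, "KB") * 8)                    # bits in 1 KB
print(to_bytes(1, "GB") // to_bytes(1, "KB"))   # KB in 1 GB
print(humanize(to_bytes(1, "ZB")))              # back to the largest unit
```

Storing the units in one ordered list keeps the conversion a single exponentiation, so extending the ladder only means appending a name.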
• Big Data is not about the size of the data; it’s about the value within the data.
• Our society is leaving behind a digital footprint.
• We are generating a huge amount of data (data with a lot of information).
• ...and a lot of noise.
Unexpected
discoveries…..
Interest in big data continues to rise: Baidu search index trend
Geographic distribution of interest in big data: top 10 cities
Demographic distribution of interest in big data: men aged 20 to 40 show the most interest
Big Data Concept
Big data refers to massive, complex data sets that existing software tools cannot extract, store, search, share, analyze, and process.
Put simply, it is data that is difficult to manage with existing general-purpose technology.
What Does Big Data Look Like?
China has built the world's largest video surveillance network
The Safe City Project was launched in 2005
Tianjin as an example
• Each HD camera produces 4 GB of data per hour
• More than 600,000 cameras citywide
• Video is retained for 3 months
➢ Requires 5,000 PB of storage, about 1 trillion HD photos of 5 MB each
➢ Requires an investment of 60 billion RMB, roughly the annual GDP of Tibet
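The Tianjin storage figure can be checked with simple arithmetic. The sketch below makes two assumptions that the slide does not state: "3 months" is taken as 90 days, and cameras record around the clock. The function names are illustrative.

```python
# Back-of-the-envelope check of the Tianjin storage estimate.
# Assumptions (not stated on the slide): 3 months = 90 days, 24-hour recording.
GB_PER_CAMERA_HOUR = 4
CAMERAS = 600_000
RETENTION_DAYS = 90


def storage_pb(gb_per_hour, cameras, days, gb_per_pb=1_000_000):
    """Total storage in PB; gb_per_pb=1_000_000 uses decimal units."""
    hours = days * 24
    total_gb = gb_per_hour * cameras * hours
    return total_gb / gb_per_pb


def photos_equivalent(petabytes, mb_per_photo=5):
    """Number of photos of the given size that fit in the storage (decimal units)."""
    return petabytes * 1_000_000_000 / mb_per_photo


decimal_pb = storage_pb(GB_PER_CAMERA_HOUR, CAMERAS, RETENTION_DAYS)
binary_pb = storage_pb(GB_PER_CAMERA_HOUR, CAMERAS, RETENTION_DAYS,
                       gb_per_pb=1024 ** 2)
print(f"decimal units: {decimal_pb:,.0f} PB")
print(f"binary units:  {binary_pb:,.0f} PB")
print(f"5 MB photos in 5,000 PB: {photos_equivalent(5000):,.0f}")
```

Under these assumptions both unit conventions land close to the 5,000 PB quoted on the slide, and 5,000 PB corresponds to one trillion 5 MB photos.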
2016 Spring Festival WeChat Big Data Report: Population Migration
WeChat released the "2016 WeChat Spring Festival Big Data Report". Based on a sample of 650 million users, the report reconstructs how Chinese people spent the Spring Festival, from New Year's Eve of the Year of the Monkey to the Lantern Festival.
2016 Overview of Chinese Internet Users and Data
• Taobao: tens of millions of transactions per day; more than 50 TB of data generated in a single day; a peak of 100,000 per minute
• Baidu: handles 6 billion search requests per day and stores information on 1 trillion web pages, more than 1,000 PB in total
• Tencent: QQ has more than 800 million monthly active users and WeChat more than 500 million; the online relationship chain exceeds 100 billion links; servers handle 100 billion calls per day
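To get a feel for these daily totals, it helps to convert them into average per-second rates. This is a rough sketch: the daily figures are the ones quoted above, 50 TB is read in decimal units, and real traffic is bursty, so peaks will be well above these averages.

```python
# Average per-second rates implied by the daily totals quoted above.
# These are averages only; actual load varies over the day.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second(daily_total):
    """Average rate per second implied by a daily total."""
    return daily_total / SECONDS_PER_DAY


daily_figures = {
    "Baidu search requests": 6e9,
    "Tencent server calls": 100e9,
    "Taobao bytes generated (50 TB, decimal)": 50e12,
}

for name, total in daily_figures.items():
    print(f"{name}: about {per_second(total):,.0f} per second")
```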
Difference Between Big Data and Traditional Data
01 Xiao Ming has been to the bookstore one hundred times
Traditional data: the question to answer is whether he will buy a book on his 101st visit, which is a matter of performance and operating indicators.
Big data: the question to answer is which book he will buy on his 101st visit, and what kind of content should be recommended to him.
02 The difference between groups and individuals
Traditional approach: the focus is on a category of people, with packages designed for them under one shared set of rules.
Internet era: each individual must be profiled precisely and matched precisely.
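The bookstore contrast can be made concrete with a toy example. Everything in this sketch is hypothetical: the visit log, the second customer "Xiao Hong", and the "most frequently bought genre" rule are stand-ins chosen for illustration, not a method described in the slides.

```python
from collections import Counter

# Hypothetical visit log: (customer, genre bought, or None if nothing was bought).
VISITS = [
    ("Xiao Ming", "sci-fi"), ("Xiao Ming", None), ("Xiao Ming", "sci-fi"),
    ("Xiao Ming", "history"), ("Xiao Hong", "cooking"), ("Xiao Hong", None),
]


def purchase_rate(visits):
    """Group-level view: share of visits that ended in a purchase."""
    bought = sum(1 for _, genre in visits if genre is not None)
    return bought / len(visits)


def recommend(visits, customer):
    """Individual-level view: the genre this customer has bought most often."""
    genres = Counter(g for c, g in visits if c == customer and g is not None)
    return genres.most_common(1)[0][0] if genres else None


print(f"Purchase rate across all visits: {purchase_rate(VISITS):.0%}")
print(f"Recommendation for Xiao Ming: {recommend(VISITS, 'Xiao Ming')}")
```

The first function answers the traditional question (does a visit turn into a sale), while the second answers the per-person question (what to offer this customer next).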
Features of Big Data (4V)
4V Features of Big Data - Volume
The total amount of data has grown 60-fold over the past 10 years!
Over the next 5 years, data is expected to grow 4 times faster than over the past 5 years!
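A 60-fold increase over 10 years can be translated into an average yearly growth rate. The sketch below assumes smooth compound growth, which the slide does not claim, and the function names are illustrative.

```python
import math


def annual_growth_factor(total_factor, years):
    """Constant yearly multiplier that compounds to total_factor over the period."""
    return total_factor ** (1 / years)


def doubling_time_years(yearly_factor):
    """Years needed for the quantity to double at a constant yearly multiplier."""
    return math.log(2) / math.log(yearly_factor)


factor = annual_growth_factor(60, 10)
print(f"Yearly multiplier: {factor:.3f}")
print(f"Doubling time: {doubling_time_years(factor):.2f} years")
```

Under that assumption the data volume grows by roughly half each year and doubles in well under two years.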
4V Features of Big Data - Velocity
4V Features of Big Data - Variety
4V Features of Big Data - Value (low value density)
A non-renewable resource
Where Does the Data Come From?
An independently operated enterprise big data solution provider
Big Data Sources
Data generation points
Examples
Mobile Devices
Microphones
Readers/Scanners
Science facilities
Programs/Software
Social Media
Cameras
Why Big Data
• The growth of big data is driven by:
– Increased storage capacity
– Increased processing power
– Availability of data (different data types)
– Every day we create 2.5 quintillion bytes of data; 90% of the data in the world today has been created in the last two years alone
➢ 300 billion emails are sent daily.
➢ 500 million tweets are sent daily.
➢ IBM claims 90% of today’s stored data was generated in just the last two years.
Benefits of “BIG DATA”
Big Data Is Changing How We Think
1. More: a shift from prediction based on traditional random samples to prediction based on the whole population
Now that data processing technology has changed so dramatically, sampling analysis in the big data era is like riding a horse in the age of the automobile. Everything has changed: we need all the data, "sample = population".
2. Messier: no longer exactness, but messiness
"Big data" usually speaks in probabilities rather than wearing a face of absolute certainty. Big data requires us to change: we must be able to accept messiness and uncertainty.
3. Better: not causation, but correlation
Knowing "what" is enough; there is no need to know "why". In the big data era, we do not have to know the reasons behind a phenomenon; instead, we let the data speak for itself.
Big Data Brings Business Change
There are three kinds of big data companies in the big data industry chain:
• Data-based companies (data owners): own the data but lack the ability to analyze it;
• Technology-based companies (technology providers): technology vendors, data analysis companies, and the like;
• Thinking-based companies (service providers): big data application companies that mine the value of data.
Big Data Brings New Challenges
One: business units have no clear big data requirements
Two: severe data silos within the enterprise
Three: low data availability and poor data quality
Four: data management technologies and architectures
Five: data security
Six: a shortage of big data talent
Seven: the trade-off between data openness and privacy
THANK YOU!
