Big Data
Hello!
I am Taimoor Hussain.
Wajid Ali
Daniyal Jan
"Data is the new science. Big Data holds the answers."
- Pat Gelsinger
"Processed data is information. Processed information is knowledge. Processed knowledge is wisdom."
- Ankala V. Subbarao
Content
◎ What is Big Data?
◎ Measurements and Estimations
◎ How much data is Big Data?
◎ Big Data Types
◎ Sources of Data
◎ Stages to Big Data
◎ References
What is Big Data?
Big Data
What matters is the importance of the data, not the amount of data!
Large volume of data
  • Structured
  • Unstructured
Measurements and Estimations
Measurements
  • Gigabyte: 1024 MB
  • Terabyte: 1024 GB
  • Petabyte: 1024 TB
  • Exabyte: 1024 PB
Estimations
  • 4.7 GB: a single DVD
  • 5 EB: all words ever spoken by mankind
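The unit ladder and the two estimates above can be checked with a little arithmetic. The sketch below is illustrative only: it assumes the slide's 1024-based convention, and the names `UNITS`, `to_bytes`, and `humanize` are made up for this example.

```python
# Binary (1024-based) units, following the convention used on the slide.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB"]

def to_bytes(value, unit):
    """Convert a value in the given unit to bytes (1 KB = 1024 B, and so on)."""
    return value * 1024 ** UNITS.index(unit)

def humanize(num_bytes):
    """Express a byte count in the largest unit that keeps the value below 1024."""
    value = float(num_bytes)
    for unit in UNITS:
        if value < 1024 or unit == UNITS[-1]:
            return f"{value:.1f} {unit}"
        value /= 1024

dvd = to_bytes(4.7, "GB")        # a single DVD
all_words = to_bytes(5, "EB")    # the slide's estimate for all spoken words
dvds_needed = all_words / dvd

print(humanize(to_bytes(1024, "GB")))
print(f"DVDs needed to hold 5 EB: {dvds_needed:,.0f}")
```

Under these assumptions, 5 EB works out to a little over a billion DVDs, which gives a feel for how quickly each step up the ladder grows.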
How much data is Big Data?
◎ A byte is a unit of measure of digital information.
◎ A terabyte is the starting flag for joining the Big Data race.
Big Data Types
◎ Structured Data
  • Information that is well organized.
◎ Semi-Structured Data
  • Lacks a strict data model structure.
◎ Unstructured Data
  • Lacks a defined structure.
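To make the three types concrete, here is a minimal sketch that reads one small sample of each using only the Python standard library. The sample values (names, fields, the sentence) are invented for illustration and do not come from the slides.

```python
import csv
import io
import json

# Structured: fixed columns, every row follows the same schema.
structured = io.StringIO("id,name,age\n1,Ali,30\n2,Sara,25\n")
rows = list(csv.DictReader(structured))

# Semi-structured: records describe themselves; fields may differ per record.
semi_structured = '[{"id": 1, "name": "Ali"}, {"id": 2, "tags": ["new"]}]'
records = json.loads(semi_structured)
all_keys = sorted({key for record in records for key in record})

# Unstructured: free text with no data model; any structure must be inferred.
unstructured = "Ali called support on Monday and asked about his bill."
words = unstructured.rstrip(".").split()

print(rows[0]["name"], all_keys, len(words))
```

The structured rows can be queried by column name straight away, the semi-structured records need a pass to discover which keys exist, and the unstructured text only yields tokens until further analysis is applied.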
Sources of Data
◎ Human Generated Data
◎ Machine Generated Data
◎ Social Graph Generated Data
Stages to Big Data
◎ Data Acquisition
  • Acquire the data.
◎ Data Extraction
  • Not all generated and acquired data is of use.
◎ Data Collation
  • Data from a single source is often not enough for analysis or prediction.
  • Multiple data sources are combined to give a big picture to analyze.
◎ Data Structuring
  • Store the acquired data in a structured format.
◎ Data Visualization
  • Data analysis involves targeting areas of interest and providing results based on the structured data.
◎ Data Interpretation
  • The final step of data processing.
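The six stages can be sketched as a toy pipeline. This is only one possible reading of the slides: the two data sources, the field names, and every function name below are hypothetical, and a real system would replace each step with proper tooling.

```python
# Toy walk-through of the six stages; all data and helper names are hypothetical.

def acquire():
    """Acquisition: gather raw records from two made-up sources."""
    web_logs = [{"user": "a", "clicks": 3}, {"user": "b", "clicks": None},
                {"user": "c", "clicks": 7}]
    crm = [{"user": "a", "city": "Lahore"}, {"user": "c", "city": "Karachi"}]
    return web_logs, crm

def extract(records):
    """Extraction: keep only records that are of use (here: clicks present)."""
    return [r for r in records if r["clicks"] is not None]

def collate(logs, crm):
    """Collation: combine both sources on the shared 'user' key."""
    city_by_user = {r["user"]: r["city"] for r in crm}
    return [{**r, "city": city_by_user.get(r["user"], "unknown")} for r in logs]

def structure(records):
    """Structuring: store as fixed-order tuples (user, city, clicks)."""
    return sorted((r["user"], r["city"], r["clicks"]) for r in records)

def visualize(table):
    """Visualization: a plain-text bar chart of clicks per user."""
    return [f"{user} ({city}): {'#' * clicks}" for user, city, clicks in table]

def interpret(table):
    """Interpretation: draw a conclusion from the structured data."""
    user, city, clicks = max(table, key=lambda row: row[2])
    return f"Most active user: {user} from {city} with {clicks} clicks"

logs, crm = acquire()
table = structure(collate(extract(logs), crm))
chart = visualize(table)
summary = interpret(table)
print("\n".join(chart))
print(summary)
```

Keeping each stage as its own function mirrors the slide order and makes it easy to see what each stage adds: extraction drops the unusable record, collation adds the city from the second source, and interpretation only runs on the structured result.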
Thanks!
Any questions?
You can find us in the CS Department.
References
◎ www.statistics.com/landing-page/data-science/data-science-quotes
◎ www.systems-thinking.org/dikw/dikw.htm
◎ https://info.varonis.com/applying-big-data-analytics-to-human-generated-data
◎ https://www.purdue.edu/discoverypark/cyber/.../BigDataWhitePaper.pdf
◎ http://community.hpe.com/t5/Business-Service-Management/A-BIG-brother-for-your-BIG-data-environment/ba-p/6284087


Big data (Data Size doesn't Matter, How and What is Data that's matter)


Editor's Notes

  1. 40 zettabytes is roughly 43 trillion gigabytes. 4.4 million IT jobs were created in 2015 to support big data (1.9 million in the US).
  2. In semi-structured data, the information that is normally associated with a database schema is contained within the data itself. This is why such data is sometimes called self-describing.
  3. A data mart is a subset of the data warehouse, usually oriented to a specific business line or team. Data marts are small slices of the data warehouse: whereas a data warehouse has enterprise-wide scope, the information in a data mart pertains to a single department. In computing, a data warehouse (DW or DWH), also known as an enterprise data warehouse (EDW), is a system used for reporting and data analysis and is considered a core component of a business intelligence environment. DWs are central repositories of integrated data from one or more disparate sources. A distributed database is a database in which the storage devices are not all attached to a common processing unit such as the CPU, and which is controlled by a distributed database management system (together sometimes called a distributed database system). It may be stored on multiple computers in the same physical location, or dispersed over a network of interconnected computers. Unlike parallel systems, in which the processors are tightly coupled and constitute a single database system, a distributed database system consists of loosely coupled sites that share no physical components.
  4. MapReduce Framework: MapReduce is a programming model and an associated implementation for processing and generating large data sets with a parallel, distributed algorithm on a cluster. Google Dremel: Dremel is a distributed system developed at Google for interactively querying large datasets; it powers Google's BigQuery service. BigQuery is a RESTful web service that enables interactive analysis of massively large datasets, working in conjunction with Google Storage. It is an Infrastructure as a Service (IaaS) offering that may be used complementarily with MapReduce.
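
The MapReduce model mentioned in the last note can be illustrated with the classic word count. The sketch below runs in a single process and only mimics the map, shuffle, and reduce phases; it is not the API of Hadoop or of any real cluster framework, and the function names and sample documents are assumptions made for this example.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)

documents = ["big data is big", "data is the new science"]
mapped = [pair for doc in documents for pair in map_phase(doc)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(counts)
```

On a real cluster, each document split would be mapped on a different node and the shuffle would move pairs across the network; because each key is reduced independently, the work parallelizes naturally.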