SlideShare a Scribd company logo
1 of 19
Download to read offline
BIG DATA
Prepared By-:
Raj Sharma(student)
Computer Science
3rd Sem,2nd Year
Subject-:pd
G.T. Polytechnic College
Jaora
Under the guidance of respected
Sir Rakesh mukati
Session 2021-22
OUTLINE
 INTRODUCTION
 BRIEF HISTROY BIG DATA
 BIG DATA DEFINATION
 BIG DATA ANALYTICS
 OPEN RESEARCH ISSUES
 CONCLUSIONS
 REFERENCES
INTRODUCTION
Big data is a collection of massive and complex data sets and data
volume that include the huge quantities of data, data management
capabilities, social media analytics and real-time data. Big data analytics
is the process of examining large amounts of data. There exist large
amounts of heterogeneous digital data.
BIG DATA – A BRIEF HISTORY
BIG DATA DEFINITION
 Big Data is a collection of data that is huge in volume, yet growing exponentially
with time. It is a data with so large size and complexity that none of traditional data
management tools can store it or process it efficiently. Big data is also a data but
with huge size.
BIG DATA DEFINITION (CONT…)
A well known definition of Big Data known as 3Vs
 Volume (Data is Huge)
 Velocity (Data is changing with time and coming with a velocity)
 Variety (Data is coming from multiple sources in multiple forms)
BIG DATA DEFINITION (CONT…)
The 3Vs definition was incomplete so following dimensions to the data are added in
definition:
 Veracity
 Validity
 Value
 Variability
 Venue
 Vocabulary and
 Vagueness
The data satisfying set of all these properties is known as Big Data.
TYPES OF BIG DATA
In the simplest terms, Big Data can be broken down into two basic types, structured and
unstructured data.
• Structured – Predefined data type
• Spreadsheets and Oracle Relational database
• Unstructured – is non pre-defined data model or is not organized in a pre-defined manner.
• Video, Audio, Images, Metadata, etc…
• Semi-structured – Structured data embedded with some unstructured data
• Email, Text Messaging
SOURCES OF BIG DATA
https://www.google.de/search?q=evolution+of+business+intelligence&newwindow=1&tbm=isch&tbo=u&source=univ&sa=X&ei=gEGoU5KXBuTb4QSGsoH4BQ&ved=
0CDsQsAQ&biw=1366&bih=64
APPLICATION DOMAINS OF BIG DATA
http://www.meltinfo.com/ppt/ibm-big-data
BIG DATA IN BUSINESS INTELLIGENCE
Every company relies on data, and this data is used to generate reports which will help companies
make educated decisions with the right information. When there is a multitude of data to access,
analyze and compile, if done manually, such a task might need a lot of resources and time. Business
intelligence (BI) helps simplify and automate the process as well as generate qualitative reports.
Business intelligence makes use of enterprise data to enhance strategic and operational decision-
making. As a process, BI involves the consolidation, analysis, and communication of business
information to assist companies in making the right business decision. And, as a technology, BI
consists of different tools that automate data consolidation, analysis, and the presentation of
business information to end-users. BI helps deliver relevant and reliable information to the right
people at the right time to get better answers even when the data is collated from multiple sources.
FIVE CHARACTERISTICS OF BIG DATA
• Big Data is defined by five characteristics:
Volume: Data created by and moving through today’s services may describe tens of millions of
customers, hundreds of millions of devices, and billions of transactions or statistical records. Such
scale requires careful engineering, as it is necessary to carefully conserve even the number of CPU
instructions or operating system events and network messages per data items. Parallel processing is
a powerful tool to cope with scale. MapReduce computing frameworks like Hadoop and storage
systems like HBASE and Cassandra provide low-cost, practical system foundations. Analysis also
requires efficient algorithms, because “data in flight” may only be observed one time, so conventional
storage-based approaches may not work. Large volumes of data may require a mix of “move the
data to the processing” and “move the processing to the data” architectural styles.
Velocity: Timeliness is often critical to the value of Big Data. For example, online customers may
expect promotions (coupons) received on a mobile device to reflect their current location, or they
may expect recommendations to reflect their most recent purchases or media that was accessed.
The business value of some data decays rapidly. Because raw data is often delivered in streams, or
in small batches in near real-time, the requirement to deliver rapid results can be demanding and
does not mesh well with conventional data warehouse technology.
FIVE CHARACTERISTICS OF BIG DATA (2)
 Variety: Big Data often means integrating and processing multiple types of data. We can consider most data
sources as structured, semi-structured, or unstructured. Structured data refers to records with fixed fields and
types. Unstructured data includes text, speech, and other multimedia. Semi-structured data may be a mixture of the
above, such as web documents, or sparse records with many variants, such as personal medical records with well
defined but complex types.
 Veracity: Data sources (even in the same domain) are of widely differing qualities, with significant differences in
the coverage, accuracy and timeliness of data provided. Per IBM's Big Data website, one in three business
leaders don't trust the information they use to make decisions. Establishing trust in big data presents a huge
challenge as the variety and number of sources grows..
 Variability: Beyond the immediate implications of having many types of data, the variety of data may also be
reflected in the frequency with which new data sources and types are introduced.
NOTE: Big Data generally includes data sets with sizes beyond the ability of commonly-used software tools to capture, manage, and process
the data within a tolerable elapsed time. Big data sizes are a constantly moving target, from hundreds of terabytes to many petabytes of data
in a single data set. With this difficulty, a new tool sets has arisen to handle making sense over these large quantities of data. Big data is
difficult to work with using relational databases, desktop statistics and visualization packages, requiring instead "massively parallel software
running on tens, hundreds, or even thousands of servers".
A BIG DATA PLATFORM MUST:
Analyze a variety of different types of information
This information by itself could be unrelated, but when paired with other information can
illustrate a causation for an various events that the business can take advantage.
Analyze information in motion
Various data types will be streaming and contain a large amount of "bursts". Ad hoc
analysis needs to be done on the streaming data to search for relevant events.
Cover extremely large volumes of data
Due to the proliferation of devices in the network, how they are used, along with customer
interactions on smartphones and the web, a cost efficient process to analyze the
petabytes of information is required
A BIG DATA PLATFORM MUST: (2)
 Cover varying types of data sources
Data can be streaming, batch, structured, unstructured, and semi-
structured, depending on the information type, where it comes from
and its primary use. Big Data must be able to accommodate all of
these various types of data on a very large scale.
 Analytics
Big Data must provide the mechanisms to allow ad-hoc queries,
data discovery and experimentation on the large data sets to
effectively correlate various events and data types to get an
understanding of the data that is useful and addresses business
needs.
REFRENCE-:google
THANK YOU

More Related Content

Similar to Big-Data-Analytics.8592259.powerpoint.pdf

A Review Paper on Big Data and Hadoop for Data Science
A Review Paper on Big Data and Hadoop for Data ScienceA Review Paper on Big Data and Hadoop for Data Science
A Review Paper on Big Data and Hadoop for Data Scienceijtsrd
 
An Encyclopedic Overview Of Big Data Analytics
An Encyclopedic Overview Of Big Data AnalyticsAn Encyclopedic Overview Of Big Data Analytics
An Encyclopedic Overview Of Big Data AnalyticsAudrey Britton
 
What Is Big Data How Big Data Works.pdf
What Is Big Data How Big Data Works.pdfWhat Is Big Data How Big Data Works.pdf
What Is Big Data How Big Data Works.pdfPridesys IT Ltd.
 
What Is Big Data How Big Data Works.pdf
What Is Big Data How Big Data Works.pdfWhat Is Big Data How Big Data Works.pdf
What Is Big Data How Big Data Works.pdfPridesys IT Ltd.
 
Unit No2 Introduction to big data.pdf
Unit No2 Introduction to big data.pdfUnit No2 Introduction to big data.pdf
Unit No2 Introduction to big data.pdfRanjeet Bhalshankar
 
Big data data lake and beyond
Big data data lake and beyond Big data data lake and beyond
Big data data lake and beyond Rajesh Kumar
 
Big data seminor
Big data seminorBig data seminor
Big data seminorberasrujana
 
Big Data Driven Solutions to Combat Covid' 19
Big Data Driven Solutions to Combat Covid' 19Big Data Driven Solutions to Combat Covid' 19
Big Data Driven Solutions to Combat Covid' 19Prof.Balakrishnan S
 
Lecture 5 - Big Data and Hadoop Intro.ppt
Lecture 5 - Big Data and Hadoop Intro.pptLecture 5 - Big Data and Hadoop Intro.ppt
Lecture 5 - Big Data and Hadoop Intro.pptalmaraniabwmalk
 
IRJET- Big Data Management and Growth Enhancement
IRJET- Big Data Management and Growth EnhancementIRJET- Big Data Management and Growth Enhancement
IRJET- Big Data Management and Growth EnhancementIRJET Journal
 
Big data peresintaion
Big data peresintaion Big data peresintaion
Big data peresintaion ahmed alshikh
 

Similar to Big-Data-Analytics.8592259.powerpoint.pdf (20)

Unit 1
Unit 1Unit 1
Unit 1
 
A Review Paper on Big Data and Hadoop for Data Science
A Review Paper on Big Data and Hadoop for Data ScienceA Review Paper on Big Data and Hadoop for Data Science
A Review Paper on Big Data and Hadoop for Data Science
 
All About Big Data
All About Big Data All About Big Data
All About Big Data
 
Big data and oracle
Big data and oracleBig data and oracle
Big data and oracle
 
M.Florence Dayana
M.Florence DayanaM.Florence Dayana
M.Florence Dayana
 
An Encyclopedic Overview Of Big Data Analytics
An Encyclopedic Overview Of Big Data AnalyticsAn Encyclopedic Overview Of Big Data Analytics
An Encyclopedic Overview Of Big Data Analytics
 
What Is Big Data How Big Data Works.pdf
What Is Big Data How Big Data Works.pdfWhat Is Big Data How Big Data Works.pdf
What Is Big Data How Big Data Works.pdf
 
What Is Big Data How Big Data Works.pdf
What Is Big Data How Big Data Works.pdfWhat Is Big Data How Big Data Works.pdf
What Is Big Data How Big Data Works.pdf
 
Data mining with big data
Data mining with big dataData mining with big data
Data mining with big data
 
1 UNIT-DSP.pptx
1 UNIT-DSP.pptx1 UNIT-DSP.pptx
1 UNIT-DSP.pptx
 
Unit No2 Introduction to big data.pdf
Unit No2 Introduction to big data.pdfUnit No2 Introduction to big data.pdf
Unit No2 Introduction to big data.pdf
 
Big data data lake and beyond
Big data data lake and beyond Big data data lake and beyond
Big data data lake and beyond
 
Big data seminor
Big data seminorBig data seminor
Big data seminor
 
Big Data Driven Solutions to Combat Covid' 19
Big Data Driven Solutions to Combat Covid' 19Big Data Driven Solutions to Combat Covid' 19
Big Data Driven Solutions to Combat Covid' 19
 
Big data
Big dataBig data
Big data
 
Big Data.pptx
Big Data.pptxBig Data.pptx
Big Data.pptx
 
Lecture 5 - Big Data and Hadoop Intro.ppt
Lecture 5 - Big Data and Hadoop Intro.pptLecture 5 - Big Data and Hadoop Intro.ppt
Lecture 5 - Big Data and Hadoop Intro.ppt
 
IRJET- Big Data Management and Growth Enhancement
IRJET- Big Data Management and Growth EnhancementIRJET- Big Data Management and Growth Enhancement
IRJET- Big Data Management and Growth Enhancement
 
Big data peresintaion
Big data peresintaion Big data peresintaion
Big data peresintaion
 
Big Data: Issues and Challenges
Big Data: Issues and ChallengesBig Data: Issues and Challenges
Big Data: Issues and Challenges
 

Recently uploaded

New 2024 Cannabis Edibles Investor Pitch Deck Template
New 2024 Cannabis Edibles Investor Pitch Deck TemplateNew 2024 Cannabis Edibles Investor Pitch Deck Template
New 2024 Cannabis Edibles Investor Pitch Deck TemplateCannaBusinessPlans
 
Cannabis Legalization World Map: 2024 Updated
Cannabis Legalization World Map: 2024 UpdatedCannabis Legalization World Map: 2024 Updated
Cannabis Legalization World Map: 2024 UpdatedCannaBusinessPlans
 
Getting Real with AI - Columbus DAW - May 2024 - Nick Woo from AlignAI
Getting Real with AI - Columbus DAW - May 2024 - Nick Woo from AlignAIGetting Real with AI - Columbus DAW - May 2024 - Nick Woo from AlignAI
Getting Real with AI - Columbus DAW - May 2024 - Nick Woo from AlignAITim Wilson
 
Falcon Invoice Discounting: The best investment platform in india for investors
Falcon Invoice Discounting: The best investment platform in india for investorsFalcon Invoice Discounting: The best investment platform in india for investors
Falcon Invoice Discounting: The best investment platform in india for investorsFalcon Invoice Discounting
 
Katrina Personal Brand Project and portfolio 1
Katrina Personal Brand Project and portfolio 1Katrina Personal Brand Project and portfolio 1
Katrina Personal Brand Project and portfolio 1kcpayne
 
Phases of Negotiation .pptx
 Phases of Negotiation .pptx Phases of Negotiation .pptx
Phases of Negotiation .pptxnandhinijagan9867
 
Arti Languages Pre Seed Teaser Deck 2024.pdf
Arti Languages Pre Seed Teaser Deck 2024.pdfArti Languages Pre Seed Teaser Deck 2024.pdf
Arti Languages Pre Seed Teaser Deck 2024.pdfwill854175
 
SEO Case Study: How I Increased SEO Traffic & Ranking by 50-60% in 6 Months
SEO Case Study: How I Increased SEO Traffic & Ranking by 50-60%  in 6 MonthsSEO Case Study: How I Increased SEO Traffic & Ranking by 50-60%  in 6 Months
SEO Case Study: How I Increased SEO Traffic & Ranking by 50-60% in 6 MonthsIndeedSEO
 
How to Get Started in Social Media for Art League City
How to Get Started in Social Media for Art League CityHow to Get Started in Social Media for Art League City
How to Get Started in Social Media for Art League CityEric T. Tung
 
joint cost.pptx COST ACCOUNTING Sixteenth Edition ...
joint cost.pptx  COST ACCOUNTING  Sixteenth Edition                          ...joint cost.pptx  COST ACCOUNTING  Sixteenth Edition                          ...
joint cost.pptx COST ACCOUNTING Sixteenth Edition ...NadhimTaha
 
Famous Olympic Siblings from the 21st Century
Famous Olympic Siblings from the 21st CenturyFamous Olympic Siblings from the 21st Century
Famous Olympic Siblings from the 21st Centuryrwgiffor
 
Dr. Admir Softic_ presentation_Green Club_ENG.pdf
Dr. Admir Softic_ presentation_Green Club_ENG.pdfDr. Admir Softic_ presentation_Green Club_ENG.pdf
Dr. Admir Softic_ presentation_Green Club_ENG.pdfAdmir Softic
 
Lundin Gold - Q1 2024 Conference Call Presentation (Revised)
Lundin Gold - Q1 2024 Conference Call Presentation (Revised)Lundin Gold - Q1 2024 Conference Call Presentation (Revised)
Lundin Gold - Q1 2024 Conference Call Presentation (Revised)Adnet Communications
 
Falcon Invoice Discounting: Empowering Your Business Growth
Falcon Invoice Discounting: Empowering Your Business GrowthFalcon Invoice Discounting: Empowering Your Business Growth
Falcon Invoice Discounting: Empowering Your Business GrowthFalcon investment
 
Rice Manufacturers in India | Shree Krishna Exports
Rice Manufacturers in India | Shree Krishna ExportsRice Manufacturers in India | Shree Krishna Exports
Rice Manufacturers in India | Shree Krishna ExportsShree Krishna Exports
 
Organizational Transformation Lead with Culture
Organizational Transformation Lead with CultureOrganizational Transformation Lead with Culture
Organizational Transformation Lead with CultureSeta Wicaksana
 

Recently uploaded (20)

New 2024 Cannabis Edibles Investor Pitch Deck Template
New 2024 Cannabis Edibles Investor Pitch Deck TemplateNew 2024 Cannabis Edibles Investor Pitch Deck Template
New 2024 Cannabis Edibles Investor Pitch Deck Template
 
Cannabis Legalization World Map: 2024 Updated
Cannabis Legalization World Map: 2024 UpdatedCannabis Legalization World Map: 2024 Updated
Cannabis Legalization World Map: 2024 Updated
 
Getting Real with AI - Columbus DAW - May 2024 - Nick Woo from AlignAI
Getting Real with AI - Columbus DAW - May 2024 - Nick Woo from AlignAIGetting Real with AI - Columbus DAW - May 2024 - Nick Woo from AlignAI
Getting Real with AI - Columbus DAW - May 2024 - Nick Woo from AlignAI
 
Falcon Invoice Discounting: The best investment platform in india for investors
Falcon Invoice Discounting: The best investment platform in india for investorsFalcon Invoice Discounting: The best investment platform in india for investors
Falcon Invoice Discounting: The best investment platform in india for investors
 
Buy gmail accounts.pdf buy Old Gmail Accounts
Buy gmail accounts.pdf buy Old Gmail AccountsBuy gmail accounts.pdf buy Old Gmail Accounts
Buy gmail accounts.pdf buy Old Gmail Accounts
 
Katrina Personal Brand Project and portfolio 1
Katrina Personal Brand Project and portfolio 1Katrina Personal Brand Project and portfolio 1
Katrina Personal Brand Project and portfolio 1
 
HomeRoots Pitch Deck | Investor Insights | April 2024
HomeRoots Pitch Deck | Investor Insights | April 2024HomeRoots Pitch Deck | Investor Insights | April 2024
HomeRoots Pitch Deck | Investor Insights | April 2024
 
unwanted pregnancy Kit [+918133066128] Abortion Pills IN Dubai UAE Abudhabi
unwanted pregnancy Kit [+918133066128] Abortion Pills IN Dubai UAE Abudhabiunwanted pregnancy Kit [+918133066128] Abortion Pills IN Dubai UAE Abudhabi
unwanted pregnancy Kit [+918133066128] Abortion Pills IN Dubai UAE Abudhabi
 
Phases of Negotiation .pptx
 Phases of Negotiation .pptx Phases of Negotiation .pptx
Phases of Negotiation .pptx
 
Arti Languages Pre Seed Teaser Deck 2024.pdf
Arti Languages Pre Seed Teaser Deck 2024.pdfArti Languages Pre Seed Teaser Deck 2024.pdf
Arti Languages Pre Seed Teaser Deck 2024.pdf
 
SEO Case Study: How I Increased SEO Traffic & Ranking by 50-60% in 6 Months
SEO Case Study: How I Increased SEO Traffic & Ranking by 50-60%  in 6 MonthsSEO Case Study: How I Increased SEO Traffic & Ranking by 50-60%  in 6 Months
SEO Case Study: How I Increased SEO Traffic & Ranking by 50-60% in 6 Months
 
How to Get Started in Social Media for Art League City
How to Get Started in Social Media for Art League CityHow to Get Started in Social Media for Art League City
How to Get Started in Social Media for Art League City
 
joint cost.pptx COST ACCOUNTING Sixteenth Edition ...
joint cost.pptx  COST ACCOUNTING  Sixteenth Edition                          ...joint cost.pptx  COST ACCOUNTING  Sixteenth Edition                          ...
joint cost.pptx COST ACCOUNTING Sixteenth Edition ...
 
Famous Olympic Siblings from the 21st Century
Famous Olympic Siblings from the 21st CenturyFamous Olympic Siblings from the 21st Century
Famous Olympic Siblings from the 21st Century
 
Dr. Admir Softic_ presentation_Green Club_ENG.pdf
Dr. Admir Softic_ presentation_Green Club_ENG.pdfDr. Admir Softic_ presentation_Green Club_ENG.pdf
Dr. Admir Softic_ presentation_Green Club_ENG.pdf
 
Mifty kit IN Salmiya (+918133066128) Abortion pills IN Salmiyah Cytotec pills
Mifty kit IN Salmiya (+918133066128) Abortion pills IN Salmiyah Cytotec pillsMifty kit IN Salmiya (+918133066128) Abortion pills IN Salmiyah Cytotec pills
Mifty kit IN Salmiya (+918133066128) Abortion pills IN Salmiyah Cytotec pills
 
Lundin Gold - Q1 2024 Conference Call Presentation (Revised)
Lundin Gold - Q1 2024 Conference Call Presentation (Revised)Lundin Gold - Q1 2024 Conference Call Presentation (Revised)
Lundin Gold - Q1 2024 Conference Call Presentation (Revised)
 
Falcon Invoice Discounting: Empowering Your Business Growth
Falcon Invoice Discounting: Empowering Your Business GrowthFalcon Invoice Discounting: Empowering Your Business Growth
Falcon Invoice Discounting: Empowering Your Business Growth
 
Rice Manufacturers in India | Shree Krishna Exports
Rice Manufacturers in India | Shree Krishna ExportsRice Manufacturers in India | Shree Krishna Exports
Rice Manufacturers in India | Shree Krishna Exports
 
Organizational Transformation Lead with Culture
Organizational Transformation Lead with CultureOrganizational Transformation Lead with Culture
Organizational Transformation Lead with Culture
 

Big-Data-Analytics.8592259.powerpoint.pdf

  • 1. BIG DATA Prepared By-: Raj Sharma(student) Computer Science 3rd Sem,2nd Year Subject-:pd G.T. Polytechnic College Jaora Under the guidance of respected Sir Rakesh mukati Session 2021-22
  • 2. OUTLINE  INTRODUCTION  BRIEF HISTROY BIG DATA  BIG DATA DEFINATION  BIG DATA ANALYTICS  OPEN RESEARCH ISSUES  CONCLUSIONS  REFERENCES
  • 3. INTRODUCTION Big data is a collection of massive and complex data sets and data volume that include the huge quantities of data, data management capabilities, social media analytics and real-time data. Big data analytics is the process of examining large amounts of data. There exist large amounts of heterogeneous digital data.
  • 4. BIG DATA – A BRIEF HISTORY
  • 5. BIG DATA DEFINITION  Big Data is a collection of data that is huge in volume, yet growing exponentially with time. It is a data with so large size and complexity that none of traditional data management tools can store it or process it efficiently. Big data is also a data but with huge size.
  • 6. BIG DATA DEFINITION (CONT…) A well known definition of Big Data known as 3Vs  Volume (Data is Huge)  Velocity (Data is changing with time and coming with a velocity)  Variety (Data is coming from multiple sources in multiple forms)
  • 7. BIG DATA DEFINITION (CONT…) The 3Vs definition was incomplete so following dimensions to the data are added in definition:  Veracity  Validity  Value  Variability  Venue  Vocabulary and  Vagueness The data satisfying set of all these properties is known as Big Data.
  • 8. TYPES OF BIG DATA In the simplest terms, Big Data can be broken down into two basic types, structured and unstructured data. • Structured – Predefined data type • Spreadsheets and Oracle Relational database • Unstructured – is non pre-defined data model or is not organized in a pre-defined manner. • Video, Audio, Images, Metadata, etc… • Semi-structured – Structured data embedded with some unstructured data • Email, Text Messaging
  • 11.
  • 14. BIG DATA IN BUSINESS INTELLIGENCE Every company relies on data, and this data is used to generate reports which will help companies make educated decisions with the right information. When there is a multitude of data to access, analyze and compile, if done manually, such a task might need a lot of resources and time. Business intelligence (BI) helps simplify and automate the process as well as generate qualitative reports. Business intelligence makes use of enterprise data to enhance strategic and operational decision- making. As a process, BI involves the consolidation, analysis, and communication of business information to assist companies in making the right business decision. And, as a technology, BI consists of different tools that automate data consolidation, analysis, and the presentation of business information to end-users. BI helps deliver relevant and reliable information to the right people at the right time to get better answers even when the data is collated from multiple sources.
  • 15. FIVE CHARACTERISTICS OF BIG DATA • Big Data is defined by five characteristics: Volume: Data created by and moving through today’s services may describe tens of millions of customers, hundreds of millions of devices, and billions of transactions or statistical records. Such scale requires careful engineering, as it is necessary to carefully conserve even the number of CPU instructions or operating system events and network messages per data items. Parallel processing is a powerful tool to cope with scale. MapReduce computing frameworks like Hadoop and storage systems like HBASE and Cassandra provide low-cost, practical system foundations. Analysis also requires efficient algorithms, because “data in flight” may only be observed one time, so conventional storage-based approaches may not work. Large volumes of data may require a mix of “move the data to the processing” and “move the processing to the data” architectural styles. Velocity: Timeliness is often critical to the value of Big Data. For example, online customers may expect promotions (coupons) received on a mobile device to reflect their current location, or they may expect recommendations to reflect their most recent purchases or media that was accessed. The business value of some data decays rapidly. Because raw data is often delivered in streams, or in small batches in near real-time, the requirement to deliver rapid results can be demanding and does not mesh well with conventional data warehouse technology.
  • 16. FIVE CHARACTERISTICS OF BIG DATA (2)  Variety: Big Data often means integrating and processing multiple types of data. We can consider most data sources as structured, semi-structured, or unstructured. Structured data refers to records with fixed fields and types. Unstructured data includes text, speech, and other multimedia. Semi-structured data may be a mixture of the above, such as web documents, or sparse records with many variants, such as personal medical records with well defined but complex types.  Veracity: Data sources (even in the same domain) are of widely differing qualities, with significant differences in the coverage, accuracy and timeliness of data provided. Per IBM's Big Data website, one in three business leaders don't trust the information they use to make decisions. Establishing trust in big data presents a huge challenge as the variety and number of sources grows..  Variability: Beyond the immediate implications of having many types of data, the variety of data may also be reflected in the frequency with which new data sources and types are introduced. NOTE: Big Data generally includes data sets with sizes beyond the ability of commonly-used software tools to capture, manage, and process the data within a tolerable elapsed time. Big data sizes are a constantly moving target, from hundreds of terabytes to many petabytes of data in a single data set. With this difficulty, a new tool sets has arisen to handle making sense over these large quantities of data. Big data is difficult to work with using relational databases, desktop statistics and visualization packages, requiring instead "massively parallel software running on tens, hundreds, or even thousands of servers".
  • 17. A BIG DATA PLATFORM MUST: Analyze a variety of different types of information This information by itself could be unrelated, but when paired with other information can illustrate a causation for an various events that the business can take advantage. Analyze information in motion Various data types will be streaming and contain a large amount of "bursts". Ad hoc analysis needs to be done on the streaming data to search for relevant events. Cover extremely large volumes of data Due to the proliferation of devices in the network, how they are used, along with customer interactions on smartphones and the web, a cost efficient process to analyze the petabytes of information is required
  • 18. A BIG DATA PLATFORM MUST: (2)  Cover varying types of data sources Data can be streaming, batch, structured, unstructured, and semi- structured, depending on the information type, where it comes from and its primary use. Big Data must be able to accommodate all of these various types of data on a very large scale.  Analytics Big Data must provide the mechanisms to allow ad-hoc queries, data discovery and experimentation on the large data sets to effectively correlate various events and data types to get an understanding of the data that is useful and addresses business needs.