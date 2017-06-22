
Unit 2
Published in: Education

  1. UNIT II: Characteristics of Data
     • Composition: deals with the structure of data, i.e. sources of data, types of data, nature of data.
     • Condition: deals with the state of data.
     • Context: deals with the generation of data and the sensitivity of data.
  2. Evolution of Big Data
     • In the 1970s: data was essentially primitive and structured.
     • In the 1980s and 1990s: relational databases evolved, so this was the era of data-intensive applications.
     • In 2000 and beyond: the WWW and IoT have led to structured, unstructured and multimedia data.
  3. Big Data: how do we define Big Data?
     • It is anything beyond imagination.
     • Today's BIG may be tomorrow's NORMAL.
     • Terabytes, petabytes or zettabytes of data.
     • It is about the 3 Vs.
  4. Big Data
     • In 2001, industry analyst Doug Laney defined “Big Data” in terms of the three V’s (3Vs): Volume, Velocity and Variety.
     • In 2012, Gartner updated this definition: “Big Data” is high-volume, high-velocity and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.
     • Big Data is an evolving term that describes any voluminous amount of structured, semi-structured and unstructured data that has the potential to be mined for information.
  5. Challenges with Big Data
     • Capture
     • Storage
     • Curation
     • Search
     • Analysis
     • Transfer
     • Visualization
     • Privacy
  6. Characteristics of Big Data
     Big Data is characterized by three properties:
     • Extremely large Volume of data
     • Extremely high Velocity of data
     • Extremely wide Variety of data
  7. Other characteristics of data which are not definitional for Big Data
     • Veracity and Validity: deal with abnormality, accuracy and correctness of data.
     • Volatility: deals with how long data remains valid.
     • Variability: deals with data flow, which can be highly inconsistent.
  8. Why Big Data?
     More data → more accurate analysis → more confidence in decision making → greater impact in terms of enhancing operational efficiency, reducing cost and time, and innovating new products, new services, optimized offerings, etc.
  9. Are we only information consumers, or are we also information producers? Consider one scenario:
  10. Everyday actions that produce data:
      1. A text message inviting you to attend a party.
      2. Use of a credit/debit card at the petrol pump.
      3. The point-of-sale system at Archie's shop.
      4. Photographs and posts on social networking sites.
      5. Likes and comments on your posts.
  11. BI versus Big Data
      Business Intelligence (BI):
      1. All of the enterprise's data is housed in a central server.
      2. A typical database server scales vertically.
      3. BI data is analyzed in offline mode.
      4. BI is about structured data.
      5. Move data to code.
      Big Data:
      1. Data resides in a distributed file system.
      2. A distributed file system scales horizontally.
      3. Big Data is analyzed in both real-time and offline modes.
      4. Big Data is about a variety of data.
      5. Move code to data.
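The "move code to data" point in the comparison above can be sketched in plain Python. This is only an illustration under stated assumptions: the `partitions` dictionary, the node names and the helper `run_on_nodes` are hypothetical stand-ins for a real cluster, and no distributed framework is involved.

```python
# Hypothetical cluster: each "node" keeps its own partition of the data.
partitions = {
    "node-1": [12, 7, 3],
    "node-2": [5, 20],
    "node-3": [8, 1, 9, 4],
}


def local_sum(records):
    """The small piece of code that is shipped to every node."""
    return sum(records)


def run_on_nodes(func, nodes):
    """Run func next to each partition; only the small results travel back."""
    return {name: func(records) for name, records in nodes.items()}


# Each node computes a partial result over the data it already holds.
partial_sums = run_on_nodes(local_sum, partitions)

# The coordinator only combines the partial results, never the raw records.
total = sum(partial_sums.values())
print(partial_sums, total)
```

The design point is that the function is small and the data is large, so sending the function to the data avoids moving the records over the network.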
  12. Typical Data Warehouse Environment
      • Data sources: ERP (Enterprise Resource Planning), CRM (Customer Relationship Management), third-party apps, legacy systems
      • Data Warehouse
      • Uses: reporting/dashboarding, OLAP, ad hoc querying, modeling
  13. Typical Hadoop Environment
      • Data sources: web logs, images and videos, docs and PDFs, social media
      • Hadoop: HDFS, MapReduce
      • Other systems: operational systems, data warehouse, data marts, ODS (Operational Data Store)
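The MapReduce model named in the Hadoop environment above can be imitated in a few lines of plain Python. This is a minimal single-process sketch of the map, shuffle and reduce steps, not the Hadoop API; the function names and the sample `logs` lines are assumptions made for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values of one key into a single count."""
    return key, sum(values)


def word_count(lines):
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical web-log lines used only as sample input.
logs = ["GET /home", "GET /cart", "POST /cart"]
counts = word_count(logs)
print(counts)
```

In a real cluster the map and reduce functions would run in parallel on the nodes that hold the data blocks; here they run sequentially to keep the idea visible.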
  14. Functional Requirements of Big Data
      (1) Collection
      (2) Integration
      (3) Analysis
      (4) Actions / Decisions
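The four functional stages listed above can be read as a simple pipeline. The sketch below is one possible reading, assuming each stage is a function that feeds the next; the sample events, the field names and the `threshold` value are all hypothetical.

```python
def collect():
    """(1) Collection: gather raw events from several sources (hypothetical data)."""
    return [
        {"source": "pos", "customer": "C1", "amount": "250"},
        {"source": "web", "customer": "c1", "amount": "100"},
        {"source": "pos", "customer": "C2", "amount": "40"},
    ]


def integrate(events):
    """(2) Integration: bring records from different sources into one format."""
    return [
        {"customer": e["customer"].upper(), "amount": float(e["amount"])}
        for e in events
    ]


def analyze(records):
    """(3) Analysis: compute total spend per customer."""
    totals = {}
    for record in records:
        totals[record["customer"]] = totals.get(record["customer"], 0.0) + record["amount"]
    return totals


def act(totals, threshold=300.0):
    """(4) Actions / Decisions: pick customers whose spend reaches the threshold."""
    return [customer for customer, total in sorted(totals.items()) if total >= threshold]


decisions = act(analyze(integrate(collect())))
print(decisions)
```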
  15. Big Data Stack
      • The Big Data technical stack explains the layered architecture.
      • It is a way to think about Big Data.
      • It deals with:
        • Storage
        • Analytics
        • Reporting
        • Applications
      • Let's watch this video.
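  16. Big Data Stack
      • Layer 0: Redundant Physical Infrastructure
      • Layer 1: Security Infrastructure
      • Layer 2: Operational Databases
      • Layer 3: Organizing Data Services and Tools
      • Layer 4: Analytical Data Warehouses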
  17. Big Data Stack Layer 0 (Redundant Physical Infrastructure): deals with hardware, network and so on.
      • Performance: How responsive do you need the system to be? Very fast infrastructure tends to be very expensive.
      • Availability: Do you need a 100% uptime guarantee of service? Highly available infrastructure is very expensive.
      • Scalability: How big does your infrastructure need to be? How much disk space is needed?
      • Flexibility: How quickly can you add more resources to the infrastructure?
      • Cost: What can you afford?
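To make the availability question above concrete, the following sketch converts an uptime percentage into the downtime it permits per year. The availability levels used here are illustrative assumptions and are not taken from the slides.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours in a non-leap year


def downtime_hours_per_year(availability_percent):
    """Hours per year a service may be down at the given availability level."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


# Illustrative availability targets (assumed, not from the slides).
for level in (99.0, 99.9, 99.99):
    hours = downtime_hours_per_year(level)
    print(f"{level}% uptime allows about {hours:.2f} hours of downtime per year")
```

Each additional "nine" cuts the permitted downtime by a factor of ten, which is one reason highly available infrastructure becomes expensive so quickly.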
  18. Big Data Stack Layer 1 (Security Infrastructure): security and privacy requirements for Big Data are similar to the requirements for conventional data environments.
      • Data Access: data should be available only to authorized persons.
      • Application Access: most APIs offer protection from unauthorized usage or access.
      • Data Encryption: this is the most challenging aspect of security in a Big Data environment.
      • Threat Detection: the inclusion of mobile devices and social networks exponentially increases both the amount of data and the opportunities for security threats.
  19. Big Data Stack Layer 2 (Operational Databases):
      • A Big Data environment needs a fast and scalable database engine.
      • Using an RDBMS for Big Data is not a practical solution.
      • Choose a proper database.
      • Your database must support ACID.
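The ACID requirement above can be illustrated with atomicity, the "A" in ACID: a transaction either applies completely or not at all. The sketch uses Python's built-in `sqlite3` module purely because it is easy to run; it is an assumption made for illustration and not a recommendation of SQLite for Big Data. The `accounts` table and the `transfer` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one atomic transaction."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance; the credit to
        # dst made earlier in the same transaction is rolled back as well.
        return False


ok = transfer(conn, "alice", "bob", 30)
failed = transfer(conn, "alice", "bob", 500)
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)
```

The second transfer credits the destination first and then fails on the debit, so without atomicity the destination would keep money that never left the source account.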
  20. Big Data Stack Layer 3 (Organizing Data Services and Tools): organizing data services and tools capture, validate and assemble various Big Data elements into contextually relevant collections. Because Big Data is massive, these tools need to provide integration, translation, normalization and scale. Technologies in this layer are as follows:
      • A distributed file system
      • Serialization services
      • Coordination services
      • Extract, Transform and Load (ETL) tools
      • Workflow services
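Two of the technologies listed above, ETL and serialization, can be sketched together with the Python standard library. This is a toy example under stated assumptions: the `RAW` CSV text, its column names and the in-memory `sink` are hypothetical, and a real ETL tool would read from and write to external systems.

```python
import csv
import io
import json

# Hypothetical raw input with stray whitespace and one invalid amount.
RAW = "id,name,amount\n1, Asha ,250\n2,Ravi,not-a-number\n3,Meena,75\n"


def extract(text):
    """Extract: parse the raw CSV text into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: validate and normalize rows, skipping invalid ones."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # validation step: drop rows whose amount is not numeric
        clean.append(
            {"id": int(row["id"]), "name": row["name"].strip(), "amount": amount}
        )
    return clean


def load(records, sink):
    """Load: serialize each record as one JSON line into the sink."""
    for record in records:
        sink.write(json.dumps(record) + "\n")


sink = io.StringIO()
load(transform(extract(RAW)), sink)
lines = sink.getvalue().splitlines()
print(lines)
```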
  21. Big Data Stack Layer 4 (Analytical Data Warehouses):
      • Data warehouses and data marts contain normalized data gathered from a variety of sources and assembled to facilitate analysis of the business.
      • They are used for the creation of reports and the visualization of disparate data items.
  22. Big Data Analytics: requires proper analytical tools. This architecture lists three classes of tools:
      • Reporting and dashboards: these tools provide a “user-friendly” representation of information.
      • Visualization
      • Analytics and advanced analytics
  23. Big Data Applications: need to choose categories of applications.
