Cloud Computing & Big Data


This presentation is about Cloud Computing & Big Data Analytics.

Published in: Technology

  1. Mr. Kailash Shaw [HOD (CSE Dept.)]; Mrinal Kumar - 1301292599; Pranav Kumar - 1301292603
  2. Introduction; Why Cloud Computing; Benefits of Cloud Computing; Characteristics; Advantages of Cloud Computing; Disadvantages of Cloud Computing; How Cloud Computing Works; Challenges of Cloud Computing; Layers of Cloud Computing; Components of Cloud Computing; Big Data; 3 Vs of Big Data; Importance of Big Data; What Comes Under Big Data; Hadoop; Hadoop Architecture; Hadoop With Big Data; Map Reduce; Why Data Analytics; Types of Analysis; Types of Data Analytics; Big Data Analytics; Conclusion; References; Thanking You
  3. What is Cloud Computing? Cloud computing is an Internet-based computing technology. It is a next-stage technology that uses the cloud to provide services whenever and wherever the user needs them. It provides a way to access servers worldwide. What is a Cloud? A cloud is a combination of networks, hardware, services, storage, and interfaces that helps deliver computing as a service.
  4. Why Cloud Computing? [Figure: comparison of "Without Cloud Computing" and "With Cloud Computing"]
  5. Benefits of Cloud Computing
      • Cloud computing enables companies and applications that depend on system infrastructure to operate without owning that infrastructure.
      • By using cloud infrastructure on a pay-as-used, on-demand basis, users can save on capital and operational investment.
      • Clients can:
          • Put their data on the platform instead of on their own desktop PCs or servers.
          • Put their applications on the cloud and use the servers within the cloud for processing and data manipulation.
  6. Characteristics: Agile; Highly Reliable; Independent of Device and Location; Low Cost; Pay-Per-Use; Easy to Maintain; Highly Scalable; Multi-Shared
  7. Advantages of Cloud Computing
      • Lower-cost computers for users
      • Lower IT infrastructure costs
      • Lower maintenance costs
      • Lower software costs
      • Instant software updates
      • Increased computing power
      • Virtually unlimited storage capacity
  8. Disadvantages of Cloud Computing
      • Requires a constant Internet connection
      • Stored data might not be secure
      • Limited control and flexibility
      • Greater risk of information leakage
      • Users have no visibility into the underlying network
      • Dependence on service providers for data management
  9. Challenges of Cloud Computing
      • Using cloud computing means depending on others, which could limit flexibility and innovation.
      • Security could prove to be a big issue: it is still unclear how safe outsourced data is, and when using these services, ownership of data is not always clear.
      • Data centres can become environmental hazards: Green Cloud.
      • Cloud interoperability is still an issue.
  10. Layers of Cloud Computing
      • Infrastructure as a Service (IaaS): provides cloud infrastructure in terms of hardware, such as memory, processors, and speed.
      • Platform as a Service (PaaS): provides a cloud application platform for developers.
      • Software as a Service (SaaS): provides cloud applications to users directly, without installing anything on their systems. These applications remain on the cloud.
  11. Components of Cloud Computing
  12. Big Data: Big Data refers to a collection of data sets so large and complex that it is impossible to process them with conventional databases and tools. Big data is hard to capture, store, search, share, analyze, and visualize.
  13. 3 Vs of Big Data: the "big" in big data isn't just about volume.
      • Volume
      • Variety
      • Velocity
  14. Importance of Big Data: The importance of big data does not revolve around how much data you have, but around what you do with it. You can take data from any source and analyze it to find answers that enable:
      • Cost reductions
      • Time reductions
      • New product development and optimized offerings
      • Smart decision making
  15. What Comes Under Big Data: Black Box Data; Social Media Data; Stock Exchange Data; Power Grid Data; Transport Data; Search Engine Data; Structured Data; Semi-Structured Data; Unstructured Data
  16. What is Hadoop?
      • Hadoop is an open-source software framework for storing data and running applications on clusters of commodity hardware. It provides massive storage for any kind of data, enormous processing power, and the ability to handle virtually limitless concurrent tasks or jobs.
      • The software framework that supports HDFS, MapReduce, and other related entities is called the Hadoop project, or simply Hadoop.
      • It is open source and distributed by Apache.
  17. Hadoop Ecosystem [Figure: Apache Oozie (workflow); Pig Latin (data analysis); Mahout (machine learning); Hive (DW system); HBase; MapReduce framework; HDFS (Hadoop Distributed File System); Flume (unstructured or semi-structured data); Sqoop (structured data)]
  18. Hadoop With Big Data: Hadoop is the core platform for structuring Big Data, and it solves the problem of formatting that data for subsequent analytics. Hadoop uses a distributed computing architecture consisting of multiple servers built from commodity hardware, making it relatively
  19. Cost-Effective System; Large Cluster of Nodes; Parallel Processing; Distributed Data; Automatic Failover Management; Data Locality Optimization; Heterogeneous Cluster; Scalability
  20. Map Reduce: MapReduce is a programming model that Google has used successfully to process its "big data" sets (~20 petabytes per day).
      • A map function extracts some intelligence from raw data.
      • A reduce function aggregates the data output by the map according to some guide.
      • Users specify the computation in terms of a map function and a reduce function.
      • The underlying runtime system automatically parallelizes the computation across large-scale clusters of machines.
      • The underlying system also handles machine failures, efficient communication, and performance issues.
  21. [Figure: MapReduce data flow. The input is broken into pieces; [MAP] runs a computation on each piece; the results pass through Shuffle and Sort.]
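
The map, shuffle-and-sort, and reduce steps on slides 20 and 21 can be sketched in a few lines of Python. This is a minimal single-machine illustration, not Hadoop's or Google's implementation: the word-count task, the function names (`map_words`, `shuffle_and_sort`, `reduce_counts`, `run_job`), and the sample input are all assumptions made for the example, and a thread pool stands in for the cluster runtime.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_words(piece):
    """Map: emit a (word, 1) pair for every word in one piece of input."""
    return [(word.lower(), 1) for word in piece.split()]


def shuffle_and_sort(mapped_pieces):
    """Group every emitted value by its key, then order the groups by key."""
    groups = defaultdict(list)
    for pairs in mapped_pieces:
        for key, value in pairs:
            groups[key].append(value)
    return sorted(groups.items())


def reduce_counts(key, values):
    """Reduce: aggregate all values that share one key."""
    return key, sum(values)


def run_job(pieces):
    # Hypothetical stand-in for the runtime: map the pieces in parallel,
    # shuffle and sort the intermediate pairs, then reduce each group.
    with ThreadPoolExecutor() as pool:
        mapped = list(pool.map(map_words, pieces))
    return [reduce_counts(key, values) for key, values in shuffle_and_sort(mapped)]


# The input is already "broken into pieces" (illustrative sample data).
pieces = ["big data needs big clusters", "clusters process data"]
result = run_job(pieces)
print(result)
```

The user writes only `map_words` and `reduce_counts`; splitting the input, running the pieces in parallel, and grouping by key are the runtime's job, which is the division of labour the slides describe.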
  22. Why Data Analysis? It is important to remember that the primary value of big data comes not from the data in its raw form but from processing and analysing it, and from the insights, products, and services that emerge from that analysis.
  23. Types of Analysis: For unstructured data to be useful, it must be analysed to extract and expose the information it contains. Different types of analysis are possible, such as:
      • Entity analysis: people, organisations, objects, and events, and the relationships between them
      • Topic analysis: topics or themes, and their relative importance
      • Sentiment analysis: the subjective view of a person on a particular topic
      • Feature analysis: inherent characteristics that are significant for a particular analytical perspective (e.g. land coverage in satellite imagery)
  24. Types of Data Analytics: Analytic excellence leads to better decisions.
      • Descriptive Analytics: What is happening?
      • Diagnostic Analytics: Why did it happen?
      • Predictive Analytics: What is likely to happen?
      • Prescriptive Analytics: What should we do about it?
  25. Analytics
      • Focus on:
          • Predictive analysis
          • Data science
      • Data sets:
          • Large-scale data sets
          • More types of data
          • Raw data
          • Complex data models
      • Supports:
          • Correlations: new insights and more accurate answers
  26. Conclusion
      • Two IT initiatives are currently top of mind for organizations across the globe:
          • Big Data Analytics
          • Cloud Computing
      • As a delivery model for IT services, cloud computing has the potential to enhance business agility and productivity while enabling greater efficiencies and reducing costs.
      • Big Data is currently a big challenge for organizations. Hadoop came into existence to store and process data of such large volume, variety, and velocity.
      • Our presentation is all about Cloud Computing, Big Data, and Big Data Analytics.