Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Big Data Analytics Using Hadoop

Big data analytics is the process of examining large data sets containing a variety of data types i.e., big data to uncover hidden patterns, unknown correlations, market trends, customer preferences and other useful business information. The analytical findings can lead to more effective marketing, new revenue opportunities, better customer service, improved operational efficiency, competitive advantages over rival organizations and other business benefits. Enterprises are increasingly looking to find actionable insights into their data. Many big data projects originate from the need to answer specific business questions. With the right big data analytics platforms in place, an enterprise can boost sales, increase efficiency, and improve operations, customer service and risk management. Notably, the business area getting the most attention relates to increasing efficiencies and optimizing operations. By using big data analytics you can extract only the relevant information from terabytes, petabytes and exabytes, and analyse it to transform your business decisions for the future. Becoming proactive with big data analytics isn't a one-time endeavour, it is more of a culture change – a new way of gaining ground.
Keywords: business, analytics, exabytes, efficiency, data sets

Related Books

Free with a 30 day trial from Scribd

See all
  • Be the first to comment

Big Data Analytics Using Hadoop

  1. 1. BIG DATA ANALYTICS USING HADOOP SUBMITTED BY V.N.V. SRIKANTH 138W1A12B4
  2. 2. ABSTRACT • Big data analytics is the process of examining large data sets containing a variety of data types i.e., big data to uncover hidden patterns, unknown correlations, market trends, customer preferences and other useful business information. • The analytical findings can lead to more effective marketing, new revenue opportunities, better customer service, improved operational efficiency, competitive advantages over rival organizations and other business benefits • Many big data projects originate from the need to answer specific business questions.With the right big data analytics platforms in place, an enterprise can boost sales, increase efficiency, and improve operations, customer service and risk management
  3. 3. MOTIVATION • By using big data analytics you can extract only the relevant information from terabytes, petabytes and exabytes, and analyze it to transform your business decisions for the future. • With the right big data analytics platforms in place, an enterprise can boost sales, increase efficiency, and improve operations, customer service and risk management. 0 2 4 6 Category 1 Category 2 Category 3 Category 4 Series 1 Series 2 Series 3 • Technologies that includes Hadoop and related tools such as YARN, MapReduce, Spark, Hive and Pig as well as NoSQL databases supports the processing of large and diverse data sets across clustered systems
  4. 4. PROBLEM STATEMENT • The first challenge is in breaking down data silos to access all data an organization stores in different places and often in different systems. • A second big data challenge is in creating platforms that can pull in unstructured data as easily as structured data. • This massive volume of data is typically so large that it's difficult to process using traditional database and software methods.
  5. 5. PROBLEM STATEMENT(cont..) • The above challenges can be overcome by the implementation of following technologies Parallel Database Technologies Map Reduce • The best open source tools available are
  6. 6. 1996 1996 1997 1996 KNOWLEDGE FROM LITERATURE SURVEY
  7. 7. 1998 2013 KNOWLEDGE FROM LITERATURE SURVEY(CONT..)
  8. 8. • 2004- Initial versions of HDFS and MapReduce were implemented. • 2005-used GFS and MapReduce to perform operations. • 2006-Yahoo! created Hadoop based on GFS and MapReduce . • 2007 -Yahoo started using Hadoop on a 1000 node cluster. • 2008- Apache took over Hadoop,Tested a 4000 node cluster with it • 2009- successfully sorted a peta byte of data in less than 17 hours to handle billions of searches and indexing millions of web pages. • 2011 - Hadoop releases version 1.0 • 2013 -Version 2.0.6 is available KNOWLEDGE FROM LITERATURE SURVEY(CONT..)
  9. 9. 2003 2004 2006 LITERATURE SURVEY METHODS
  10. 10. Methods Author Year RDBMS (Relational Data Base Management Systems) E.F.CODD 1980 GRID COMPUTING IANFOSTER, CARL KESSELMAN (Early) 1990s Volunteer computing Luis F. G. Sarmenta 1996 hadoop HDFS Sanjay Ghemawat, Howard Gobioff, Shun- Tak Leung 2003 hadoop MapReduce Jefry Dean and Sanjay Ghemawat 2004 Apache Hadoop Doug Cutting & Mike Cafarella 2011 LITERATURE SURVEY METHODS(CONT..)
  11. 11. •Hardware Failure: As soon as we start using many pieces of hardware, the chance that one will fail is fairly high. • Combine the data after analysis: Most analysis tasks need to be able to combine the data in some way; data read from one disk may need to be combined with the data from any of the other 99 disks. DEMERITS OF PREVIOUS METHODS
  12. 12. Apache Hadoop is a framework for running applications on large cluster built of commodity hardware. A common way of avoiding data loss is through replication: redundant copies of the data are kept by the system so that in the event of failure, there is another copy available.The Hadoop Distributed Filesystem (HDFS), takes care of this problem. The second problem is solved by a simple programming model- Mapreduce. Hadoop is the popular open source implementation of MapReduce, a powerful tool designed for deep analysis and transformation of very large data sets. HADOOP ADVANTAGES
  13. 13. PROJECT IDEAS RELATEDTOTHETOPIC •TrafficCongestion Control •Hospital Management •College Management Systems
  14. 14. CONCLUSION By using big data analytics you can extract only the relevant information from terabytes, petabytes and exabytes, and analyze it to transform your business decisions for the future. With the right big data analytics platforms in place, an enterprise can boost sales, increase efficiency, and improve operations, customer service and risk management. Pros Cons Cost Effective Cluster management is hard Parallel processing Single point of failure Fault tolerance Security issues Scalability
  15. 15. REFERENCES  https://en.wikipedia.org/wiki/Big_data  http://searchbusinessanalytics.techtarget.com/definition/big-data- analytics  http://www.computerworld.com/article/2690856/big-data/8-big-trends-in- big-data-analytics.html  http://www.lunametrics.com/blog/2014/01/27/google-analytics-bigquery- whys-hows/  http://www.webopedia.com/TERM/B/big_data_analytics.html  http://www.sas.com/en_us/insights/analytics/big-data-analytics.html

×