Trendwise Analytics
Copyright Trendwise Analytics
Bangalore Hadoop
Meetup – 6
3rd August 2013
Video
Trendwise Analytics
Agenda
Introduction to Big Data
Trendwise Analytics
Market opportunity
IDC, a research firm, predicts that
the market for Big Data technology and services...
Trendwise Analytics
Copyright Trendwise Analytics
Forecast growth of Hadoop Job Market
Source: Indeed -- http://www.indeed...
Trendwise Analytics
What is Big Data
Trendwise Analytics
Copyright Trendwise Analytics
Internet Minute ....
Trendwise Analytics
Copyright Trendwise Analytics
Other sources
•
A commercial aircraft
generates 3GB of
flight sensor dat...
Trendwise Analytics
How are Companies
using Big Data
Technology?
Trendwise Analytics
Copyright Trendwise Analytics
917 Aug 2013
Watson wins Jeopardy!
Feb 14th
2011 – Watson wins Jeopardy...
Trendwise Analytics
Across All Industries
Web app
optimization
Smart meter
monitoring
Equipment
monitoring
Advertising
ana...
Trendwise Analytics
Copyright Trendwise Analytics
What are the types of business problems?
Source: Cloudera “Ten Common Ha...
Trendwise Analytics
Companies using Big Data
12
 New $1B corporate center for software and
analytics
 Hiring 400 data sc...
Trendwise Analytics
Copyright Trendwise Analytics
TECHNOLOGY
Hadoop Non-Hadoop
Trendwise Analytics
What is Hadoop?
Open source project started by Doug Cutting
A platform to manage Big Data
Helps in ...
Trendwise Analytics
Hadoop is a set of Apache Frameworks and
more…
 Data storage (HDFS)

Runs on commodity hardware (usu...
Trendwise Analytics
What are the core parts of a Hadoop
distribution?
A View of Hadoop (from Hortonworks)
Source: “Intro to Map Reduce” -- http://www.youtube.com/watch?v=ht3dNvdNDzI
Hadoop Cluster HDFS (Physical)
Storage
Trendwise Analytics
MapReduce Job – Logical
View
Image from - http://mm-tom.s3.amazonaws.com/blog/MapReduce.png
Image from: http://blog.jteam.nl/wp-content/uploads/2009/08/MapReduceWordCountOverview1.png
MapReduce Example -
WordCount
Ways to MapReduce
Libraries Languages
Note: Java is most common, but other languages can be used
Hadoop Ecosystem
Trendwise Analytics
Copyright Trendwise Analytics
Other components
Hive
• Data Warehouse infrastructure
that provides data...
Trendwise Analytics
Benefits of Hadoop
• Hadoop is designed to run on cheap commodity
hardware
• It automatically handles ...
Trendwise Analytics
Copyright Trendwise Analytics
Commercial Hadoop Distributions
• Cloudera
• Hortonworks
• Greenplum, A ...
Trendwise Analytics
How to get started?
1. http://hadoop.apache.org/
2. Pre-requisite:
- Linux ( Preferred OS)
- Java JDK
...
Trendwise Analytics
How it looks?
Trendwise Analytics
Web Interface – NameNode Tracker
Trendwise Analytics
Job Tracker
Trendwise Analytics
Task Tracker
Trendwise Analytics
Technology – Non Hadoop
•HPCC - HPCC Systems from LexisNexis Risk Solutions offers a proven,
• open-so...
Trendwise Analytics
Contact us
S.Mohan Kumar
Co-Founder and CEO
Trendwise Analytics
Website: www.TrendwiseAnalytics.com
Em...
Upcoming SlideShare
Loading in...5
×

Hadoop,Big Data Analytics and More

1,744

Published on

A discussion on all things Big Data, in a single presentation.

Published in: Technology
0 Comments
3 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
1,744
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
203
Comments
0
Likes
3
Embeds 0
No embeds

No notes for slide

Hadoop,Big Data Analytics and More

  1. 1. Trendwise Analytics Copyright Trendwise Analytics Bangalore Hadoop Meetup – 6 3rd August 2013 Video
  2. 2. Trendwise Analytics Agenda Introduction to Big Data
  3. 3. Trendwise Analytics Market opportunity IDC, a research firm, predicts that the market for Big Data technology and services will reach $16.9 billion by 2015, up from $3.2 billion in 2010. That is a 40 percent-a-year growth rate — about seven times the estimated growth rate for the overall information technology and communications business, according to IDC. Billions and billions: big data becomes a big deal : Deloitte predicts that in 2012, “big data” will likely experience accelerating growth and market penetration.
  4. 4. Trendwise Analytics Copyright Trendwise Analytics Forecast growth of Hadoop Job Market Source: Indeed -- http://www.indeed.com/jobtrends/Hadoop.html
  5. 5. Trendwise Analytics What is Big Data
  6. 6. Trendwise Analytics Copyright Trendwise Analytics Internet Minute ....
  7. 7. Trendwise Analytics Copyright Trendwise Analytics Other sources • A commercial aircraft generates 3GB of flight sensor data in 1 hour An ERP system for an mid size company grows by 1-2TB annually A Video Suveillance Camera generates 1- 3TB data in 3 months Airtel or Vodafone generates 3TB of Call Details Records (CDR) every day Every day 2.5 quintillion (2.5×10^18) bytes of data is created i.e., 2,500,000TB
  8. 8. Trendwise Analytics How are Companies using Big Data Technology?
  9. 9. Trendwise Analytics Copyright Trendwise Analytics 917 Aug 2013 Watson wins Jeopardy! Feb 14th 2011 – Watson wins Jeopardy! beating its human opponents. Watson is IBM’s super computer built using Big Data Technology.
  10. 10. Trendwise Analytics Across All Industries Web app optimization Smart meter monitoring Equipment monitoring Advertising analysis Life sciences research Fraud detection Healthcare outcomes Weather forecasting Natural resource exploration Social network analysis Churn analysis Traffic flow optimization IT infrastructure optimization Legal discovery
  11. 11. Trendwise Analytics Copyright Trendwise Analytics What are the types of business problems? Source: Cloudera “Ten Common Hadoopable Problems”
  12. 12. Trendwise Analytics Companies using Big Data 12  New $1B corporate center for software and analytics  Hiring 400 data scientists  Includes financial and marketing applications, but with special focus on industrial uses of big data  When will this gas turbine need maintenance? Ford collects and aggregates data from the 4 million vehicles that use in-car sensing and remote app management software The data allows to glean information on a range of issues, from how drivers are using their vehicles, to the driving environment that could help them improve the quality of the vehicle Partnered with Microsoft to develop SYNC Amazon has been collecting customer information for years--not just addresses and payment information but the identity of everything that a customer had ever bought or even looked at. They’re using that data to build customer relationship AT&T has 300 million customers A team of researchers is working to turn data collected through the company’s cellular network into a trove of information for policymakers, urban planners and traffic engineers. The researchers want to see how the city changes hourly by looking at calls and text messages relayed through cell towers around the region, noting that certain towers see more activity at different times
  13. 13. Trendwise Analytics Copyright Trendwise Analytics TECHNOLOGY Hadoop Non-Hadoop
  14. 14. Trendwise Analytics What is Hadoop? Open source project started by Doug Cutting A platform to manage Big Data Helps in Distributed computing Runs on Commodity Hardware
  15. 15. Trendwise Analytics Hadoop is a set of Apache Frameworks and more…  Data storage (HDFS)  Runs on commodity hardware (usually Linux)  Horizontally scalable  Processing (MapReduce)  Parallelized (scalable) processing  Fault Tolerant  Other Tools / Frameworks  Data Access  HBase, Hive, Pig, Mahout  Tools  Hue, Sqoop  Monitoring  Greenplum, Cloudera Hadoop Core - HDFS MapReduce API Data Access Tools & Libraries Monitoring & Alerting
  16. 16. Trendwise Analytics What are the core parts of a Hadoop distribution?
  17. 17. A View of Hadoop (from Hortonworks) Source: “Intro to Map Reduce” -- http://www.youtube.com/watch?v=ht3dNvdNDzI
  18. 18. Hadoop Cluster HDFS (Physical) Storage
  19. 19. Trendwise Analytics MapReduce Job – Logical View Image from - http://mm-tom.s3.amazonaws.com/blog/MapReduce.png
  20. 20. Image from: http://blog.jteam.nl/wp-content/uploads/2009/08/MapReduceWordCountOverview1.png MapReduce Example - WordCount
  21. 21. Ways to MapReduce Libraries Languages Note: Java is most common, but other languages can be used
  22. 22. Hadoop Ecosystem
  23. 23. Trendwise Analytics Copyright Trendwise Analytics Other components Hive • Data Warehouse infrastructure that provides data summarization and ad hoc querying on top of Hadoop PIG • A high-level data-flow language and execution framework for parallel computation Sqoop • Sqoop is a tool designed to help users of large data import existing relational databases into their Hadoop clusters Zookeeper • Zookeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services
  24. 24. Trendwise Analytics Benefits of Hadoop • Hadoop is designed to run on cheap commodity hardware • It automatically handles data replication and node failure • Handles large volumes of unstructured data easily • Last but not least – its free! ( Open source)
  25. 25. Trendwise Analytics Copyright Trendwise Analytics Commercial Hadoop Distributions • Cloudera • Hortonworks • Greenplum, A Division of EMC • IBM InfoSphere BigInsights
  26. 26. Trendwise Analytics How to get started? 1. http://hadoop.apache.org/ 2. Pre-requisite: - Linux ( Preferred OS) - Java JDK 3. Install and run a single node cluster 4. Follow the instructions on Mukul's Blog.
  27. 27. Trendwise Analytics How it looks?
  28. 28. Trendwise Analytics Web Interface – NameNode Tracker
  29. 29. Trendwise Analytics Job Tracker
  30. 30. Trendwise Analytics Task Tracker
  31. 31. Trendwise Analytics Technology – Non Hadoop •HPCC - HPCC Systems from LexisNexis Risk Solutions offers a proven, • open-source, data-intensive supercomputing platform • designed for the enterprise to solve big data problems. • SAP HANA is SAP AG’s implementation of in-memory database technology. •No-SQL Databases – Cassandra, CouchDB, Redis
  32. 32. Trendwise Analytics Contact us S.Mohan Kumar Co-Founder and CEO Trendwise Analytics Website: www.TrendwiseAnalytics.com Email: info@TrendwiseAnalytics.com US Tollfree: +1 877 268 2872 India Number: +91 80 4094 9600 Copyright Trendwise Analytics
  1. A particular slide catching your eye?

    Clipping is a handy way to collect important slides you want to go back to later.

×