Trendwise Analytics
Copyright Trendwise Analytics
Bangalore Hadoop
Meetup – 6
3rd August 2013
Agenda
Introduction to Big Data
Market opportunity
IDC, a research firm, predicts that
the market for Big Data technology and services will reach $16.9 billion by 2015,
up from $3.2 billion in 2010. That is a 40 percent-a-year growth rate —
about seven times the estimated growth rate for the overall information
technology and communications business, according to IDC.
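The growth rate quoted above can be checked with a compound annual growth rate calculation. This is a minimal sketch: the function name `cagr` is illustrative, and it assumes the 2010 and 2015 figures are five full years apart.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.40 means 40% a year)."""
    return (end_value / start_value) ** (1 / years) - 1


# Market sizes quoted from IDC above, in billions of US dollars.
idc_rate = cagr(start_value=3.2, end_value=16.9, years=2015 - 2010)
print(f"Implied growth rate: {idc_rate:.1%}")
```

The result comes out just under 40 percent a year, consistent with the rounded figure on the slide.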
Billions and billions: big data becomes a big deal
Deloitte predicts that in 2012, “big data” will likely experience accelerating growth and market penetration.
Forecast growth of Hadoop Job Market
Source: Indeed -- http://www.indeed.com/jobtrends/Hadoop.html
What is Big Data
Internet Minute ....
Other sources
• A commercial aircraft generates 3 GB of flight sensor data in 1 hour
• An ERP system for a mid-size company grows by 1-2 TB annually
• A video surveillance camera generates 1-3 TB of data in 3 months
• Airtel or Vodafone generates 3 TB of Call Detail Records (CDR) every day
• Every day, 2.5 quintillion (2.5×10^18) bytes of data are created, i.e., 2,500,000 TB
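The unit conversion behind the last figure can be sketched as follows. The sketch assumes decimal units (1 TB = 10^12 bytes), which is the convention that makes 2.5×10^18 bytes equal 2,500,000 TB; the helper name `bytes_to_tb` is illustrative.

```python
BYTES_PER_GB = 10**9   # decimal gigabyte (assumed convention)
BYTES_PER_TB = 10**12  # decimal terabyte (assumed convention)


def bytes_to_tb(num_bytes: int) -> float:
    """Convert a byte count to decimal terabytes."""
    return num_bytes / BYTES_PER_TB


# 2.5 quintillion bytes created per day, as quoted on the slide.
daily_bytes = 25 * 10**17
daily_tb = bytes_to_tb(daily_bytes)
print(f"{daily_tb:,.0f} TB per day")

# At 3 GB per flight hour, how many hours until one aircraft produces 1 TB?
aircraft_gb_per_hour = 3
hours_to_fill_one_tb = BYTES_PER_TB / (aircraft_gb_per_hour * BYTES_PER_GB)
print(f"{hours_to_fill_one_tb:.0f} flight hours per TB")
```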
How are Companies using Big Data Technology?
17 Aug 2013
Watson wins Jeopardy!
Feb 14th-16th, 2011: Watson wins Jeopardy!, beating its human opponents.
Watson is IBM’s supercomputer built using Big Data technology.
Across All Industries
• Web app optimization
• Smart meter monitoring
• Equipment monitoring
• Advertising analysis
• Life sciences research
• Fraud detection
• Healthcare outcomes
• Weather forecasting
• Natural resource exploration
• Social network analysis
• Churn analysis
• Traffic flow optimization
• IT infrastructure optimization
• Legal discovery
What are the types of business problems?
Source: Cloudera “Ten Common Hadoopable Problems”
Companies using Big Data
• New $1B corporate center for software and analytics
• Hiring 400 data scientists
• Includes financial and marketing applications, but with special focus on industrial uses of big data
• When will this gas turbine need maintenance?
Ford collects and aggregates data from the 4 million vehicles that use in-car sensing and remote app management software.
The data allows Ford to glean information on a range of issues, from how drivers are using their vehicles to the driving environment, that could help it improve the quality of the vehicle.
Partnered with Microsoft to develop SYNC.
Amazon has been collecting customer information for years: not just addresses and payment information but the identity of everything that a customer has ever bought or even looked at.
They’re using that data to build customer relationships.
AT&T has 300 million customers.
A team of researchers is working to turn data collected through the company’s cellular network into a trove of information for policymakers, urban planners and traffic engineers.
The researchers want to see how the city changes hourly by looking at calls and text messages relayed through cell towers around the region, noting that certain towers see more activity at different times.
TECHNOLOGY
Hadoop Non-Hadoop
What is Hadoop?
Open source project started by Doug Cutting
A platform to manage Big Data
Helps in Distributed computing
Runs on Commodity Hardware
Hadoop is a set of Apache Frameworks and more…
• Data storage (HDFS)
   - Runs on commodity hardware (usually Linux)
   - Horizontally scalable
• Processing (MapReduce)
   - Parallelized (scalable) processing
   - Fault tolerant
• Other Tools / Frameworks
   - Data access: HBase, Hive, Pig, Mahout
   - Tools: Hue, Sqoop
   - Monitoring: Greenplum, Cloudera
[Figure: Hadoop stack layers: Hadoop Core (HDFS), MapReduce API, Data Access, Tools & Libraries, Monitoring & Alerting]
What are the core parts of a Hadoop distribution?
A View of Hadoop (from Hortonworks)
Source: “Intro to Map Reduce” -- http://www.youtube.com/watch?v=ht3dNvdNDzI
Hadoop Cluster HDFS (Physical)
Storage
MapReduce Job – Logical View
Image from - http://mm-tom.s3.amazonaws.com/blog/MapReduce.png
Image from: http://blog.jteam.nl/wp-content/uploads/2009/08/MapReduceWordCountOverview1.png
MapReduce Example - WordCount
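Since the slide image is not reproduced here, the WordCount flow can be sketched in plain Python. This is an in-process illustration of the map, shuffle and reduce steps, not Hadoop's Java API; the function names and the sample sentences are assumptions made for the example.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word, counts):
    """Reduce: sum the counts collected for one word."""
    return word, sum(counts)


def word_count(lines):
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(word, counts) for word, counts in shuffle(pairs).items())


sample = ["deer bear river", "car car river", "deer car bear"]
counts = word_count(sample)
print(counts)
```

On a real cluster the map and reduce steps run in parallel on different nodes, and the shuffle moves data between them over the network.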
Ways to MapReduce
Libraries Languages
Note: Java is most common, but other languages can be used
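As one illustration of a non-Java route, Hadoop Streaming lets any program that reads lines from standard input and writes tab-separated key/value lines to standard output act as a mapper or reducer. The sketch below mimics that contract in Python; the function names are illustrative, and the local `sorted()` call stands in for the sort that the framework performs between the two stages.

```python
from itertools import groupby


def mapper(lines):
    """Emit one 'word<TAB>1' line per word, as a streaming mapper would print."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reducer(sorted_lines):
    """Sum counts per word; input lines must already be sorted by key."""
    parsed = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for word, group in groupby(parsed, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


# Local simulation of: mapper | sort | reducer
text = ["big data big deal", "data data everywhere"]
result = list(reducer(sorted(mapper(text))))
print("\n".join(result))
```

In an actual streaming job, each function would live in its own script that loops over `sys.stdin` and prints its output lines.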
Hadoop Ecosystem
Other components
Hive
• Data warehouse infrastructure that provides data summarization and ad hoc querying on top of Hadoop
Pig
• A high-level data-flow language and execution framework for parallel computation
Sqoop
• A tool designed to help users import large volumes of data from existing relational databases into their Hadoop clusters
ZooKeeper
• A centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services
Benefits of Hadoop
• Hadoop is designed to run on cheap commodity hardware
• It automatically handles data replication and node failure
• Handles large volumes of unstructured data easily
• Last but not least, it's free! (Open source)
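The replication point above can be illustrated with a toy model. This is only a sketch: the round-robin placement is a simplification and not HDFS's actual rack-aware policy, the node names are made up, and the 64 MB block size and replication factor of 3 are assumed values chosen to resemble common defaults of that era.

```python
def place_blocks(file_size_mb, nodes, block_size_mb=64, replication=3):
    """Split a file into blocks and assign each block to `replication` distinct nodes.

    Toy round-robin placement; requires replication <= len(nodes).
    """
    num_blocks = -(-file_size_mb // block_size_mb)  # ceiling division
    return {
        block: [nodes[(block + r) % len(nodes)] for r in range(replication)]
        for block in range(num_blocks)
    }


def readable_after_failure(placement, failed_node):
    """True if every block still has at least one replica on a surviving node."""
    return all(
        any(node != failed_node for node in replicas)
        for replicas in placement.values()
    )


nodes = ["node1", "node2", "node3", "node4"]  # hypothetical cluster
placement = place_blocks(file_size_mb=200, nodes=nodes)
for block, replicas in placement.items():
    print(block, replicas)

survives = all(readable_after_failure(placement, node) for node in nodes)
print("File readable after any single node failure:", survives)
```

Because every block lives on three different machines, losing any one machine leaves at least two copies of each block available.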
Commercial Hadoop Distributions
• Cloudera
• Hortonworks
• Greenplum, A Division of EMC
• IBM InfoSphere BigInsights
How to get started?
1. http://hadoop.apache.org/
2. Prerequisites:
- Linux (preferred OS)
- Java JDK
3. Install and run a single node cluster
4. Follow the instructions on Mukul's Blog.
How does it look?
Web Interface – NameNode Tracker
Job Tracker
Task Tracker
Technology – Non Hadoop
• HPCC - HPCC Systems from LexisNexis Risk Solutions offers a proven, open-source, data-intensive supercomputing platform designed for the enterprise to solve big data problems.
• SAP HANA - SAP AG’s implementation of in-memory database technology.
• NoSQL databases - Cassandra, CouchDB, Redis
Contact us
S.Mohan Kumar
Co-Founder and CEO
Trendwise Analytics
Website: www.TrendwiseAnalytics.com
Email: info@TrendwiseAnalytics.com
US Tollfree: +1 877 268 2872
India Number: +91 80 4094 9600
Hadoop, Big Data Analytics and More

  • 1. Trendwise Analytics. Bangalore Hadoop Meetup – 6, 3rd August 2013. Video. Copyright Trendwise Analytics.
  • 3. Market opportunity. IDC, a research firm, predicts that the market for Big Data technology and services will reach $16.9 billion by 2015, up from $3.2 billion in 2010. That is a growth rate of about 40 percent a year, roughly seven times the estimated growth rate for the overall information technology and communications business, according to IDC. Billions and billions: big data becomes a big deal. Deloitte predicts that in 2012, “big data” will likely experience accelerating growth and market penetration.
  • 4. Forecast growth of Hadoop Job Market. Source: Indeed, http://www.indeed.com/jobtrends/Hadoop.html
  • 6. Internet Minute ....
  • 7. Other sources. A commercial aircraft generates 3GB of flight sensor data in 1 hour. An ERP system for a mid-size company grows by 1-2TB annually. A video surveillance camera generates 1-3TB of data in 3 months. Airtel or Vodafone generates 3TB of Call Detail Records (CDR) every day. Every day 2.5 quintillion (2.5×10^18) bytes of data are created, i.e., 2,500,000TB.
  • 8. How are Companies using Big Data Technology?
  • 9. Watson wins Jeopardy! Feb 14th 2011: Watson wins Jeopardy!, beating its human opponents. Watson is IBM’s supercomputer built using Big Data technology. 17 Aug 2013.
  • 10. Across All Industries: Web app optimization; Smart meter monitoring; Equipment monitoring; Advertising analysis; Life sciences research; Fraud detection; Healthcare outcomes; Weather forecasting; Natural resource exploration; Social network analysis; Churn analysis; Traffic flow optimization; IT infrastructure optimization; Legal discovery.
  • 11. What are the types of business problems? Source: Cloudera, “Ten Common Hadoopable Problems”
  • 12. Companies using Big Data. New $1B corporate center for software and analytics; hiring 400 data scientists; includes financial and marketing applications, but with a special focus on industrial uses of big data. Example question: when will this gas turbine need maintenance? Ford collects and aggregates data from the 4 million vehicles that use in-car sensing and remote app management software. The data allows Ford to glean information on a range of issues, from how drivers are using their vehicles to the driving environment, that could help improve the quality of the vehicle. Ford partnered with Microsoft to develop SYNC. Amazon has been collecting customer information for years: not just addresses and payment information but the identity of everything that a customer has ever bought or even looked at. Amazon is using that data to build customer relationships. AT&T has 300 million customers. A team of researchers is working to turn data collected through the company’s cellular network into a trove of information for policymakers, urban planners and traffic engineers. The researchers want to see how the city changes hourly by looking at calls and text messages relayed through cell towers around the region, noting that certain towers see more activity at different times.
  • 13. Technology: Hadoop and Non-Hadoop
  • 14. What is Hadoop? An open source project started by Doug Cutting; a platform to manage Big Data; helps in distributed computing; runs on commodity hardware.
  • 15. Hadoop is a set of Apache Frameworks and more… Data storage (HDFS): runs on commodity hardware (usually Linux); horizontally scalable. Processing (MapReduce): parallelized (scalable) processing; fault tolerant. Other Tools / Frameworks: Data Access (HBase, Hive, Pig, Mahout); Tools (Hue, Sqoop); Monitoring (Greenplum, Cloudera). Diagram layers: Hadoop Core - HDFS; MapReduce API; Data Access; Tools & Libraries; Monitoring & Alerting.
  • 16. What are the core parts of a Hadoop distribution?
  • 17. A View of Hadoop (from Hortonworks) Source: “Intro to Map Reduce” -- http://www.youtube.com/watch?v=ht3dNvdNDzI
  • 18. Hadoop Cluster: HDFS (Physical) Storage
  • 19. MapReduce Job – Logical View. Image from http://mm-tom.s3.amazonaws.com/blog/MapReduce.png
  • 21. Ways to MapReduce: Libraries; Languages. Note: Java is most common, but other languages can be used.
  • 23. Other components. Hive: a data warehouse infrastructure that provides data summarization and ad hoc querying on top of Hadoop. Pig: a high-level data-flow language and execution framework for parallel computation. Sqoop: a tool designed to help users with large data sets import existing relational databases into their Hadoop clusters. ZooKeeper: a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services.
  • 24. Benefits of Hadoop: Hadoop is designed to run on cheap commodity hardware; it automatically handles data replication and node failure; it handles large volumes of unstructured data easily; last but not least, it's free (open source)!
  • 25. Commercial Hadoop Distributions: Cloudera; Hortonworks; Greenplum, a division of EMC; IBM InfoSphere BigInsights.
  • 26. How to get started? (1) http://hadoop.apache.org/ (2) Prerequisites: Linux (preferred OS) and the Java JDK. (3) Install and run a single node cluster. (4) Follow the instructions on Mukul's Blog.
  • 28. Web Interface – NameNode Tracker
  • 31. Technology – Non Hadoop. HPCC: HPCC Systems from LexisNexis Risk Solutions offers a proven, open-source, data-intensive supercomputing platform designed for the enterprise to solve big data problems. SAP HANA: SAP AG’s implementation of in-memory database technology. NoSQL databases: Cassandra, CouchDB, Redis.
  • 32. Contact us: S. Mohan Kumar, Co-Founder and CEO, Trendwise Analytics; Website: www.TrendwiseAnalytics.com; Email: info@TrendwiseAnalytics.com; US Tollfree: +1 877 268 2872; India Number: +91 80 4094 9600. Copyright Trendwise Analytics
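
The logical view of a MapReduce job (slide 19) can be illustrated with a small word count. The sketch below is plain Python that imitates the map, shuffle and reduce phases in a single process. It does not use Hadoop or any Hadoop API, and the function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical names chosen for this illustration, not taken from the slides. Python is used here only because slide 21 notes that languages other than Java can be used.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group emitted values by key.

    In a real cluster the framework does this between map and reduce;
    here it is simulated with an in-memory dictionary (an assumption
    made for illustration only).
    """
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence over the input lines."""
    mapped = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data is big", "hadoop handles big data"]
    print(word_count(sample))
```

On a real cluster the map and reduce steps would run in parallel on different nodes over blocks stored in HDFS; this single-process version only shows how data flows between the phases.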