This document summarizes a presentation about using Hadoop as an analytic platform. It discusses how Actian has added seven key ingredients to Hadoop to unlock its full potential for analytics. These include high-speed data integration, a visual framework for data science and modeling, open-source analytic operators, high-performance data processing engines, vector-based SQL processing natively on HDFS, an extremely fast parallel analytics engine, and a next-generation big data analytics platform. The goal is to transform Hadoop from merely a data reservoir to a fully-featured analytics platform.
Hadoop Analytics Made Easy with Actian's Visual Platform
1. Grab some coffee and enjoy the pre-show banter before the top of the hour!
2. Hadoop as an Analytic Platform: Why Not?
The Briefing Room
3. Welcome
Twitter Tag: #briefr
Host: Eric Kavanagh
eric.kavanagh@bloorgroup.com
@eric_kavanagh
4. Mission
• Reveal the essential characteristics of enterprise software, good and bad
• Provide a forum for detailed analysis of today’s innovative technologies
• Give vendors a chance to explain their product to savvy analysts
• Allow audience members to pose serious questions... and get answers!
5. Topics
This Month: ANALYTIC PLATFORMS
November: DISCOVERY & VISUALIZATION
December: INNOVATORS
2014 Editorial Calendar at www.insideanalysis.com/webcasts/the-briefing-room
6. A NEW ERA of Architecture
Executive Summary
• Don’t build CARRIAGES for highways
• Focus on NEW opportunities
• SLOWLY wean off old systems
7. Analyst: William McKnight
William is President of McKnight Consulting Group. His clients have included 17 of the Global 2000. Many clients have gone public with their success story. His team's implementations have won multiple Best Practices awards. William is an Entrepreneur of the Year Finalist, a frequent best practices judge and an expert witness. He has hundreds of articles and dozens of white papers in publication. William has also given numerous keynote presentations worldwide at major conferences and has given hundreds of public seminars and webinars. William’s experience includes taking his company to placement on the Inc. 500 and the Dallas 100 and selling a multi-million dollar consulting firm. He is a passionate communicator and motivator, and a former IT VP of a Fortune 50 company.
8. Actian
• Actian is a database and software development company
• The Actian Analytics Platform connects to data and Big Data sources to perform actionable and advanced analytics
• Actian recently released Hadoop SQL Edition, a component that enables SQL access on data stored in Hadoop
9. Guest: Jim Hare
Jim Hare is Senior Director of Product Marketing for the Actian Analytics Platform, helping organizations transform big data into business value. Prior to Actian, he was Director of Marketing at IBM, responsible for go-to-market strategy and messaging for the big data platform. Prior to joining IBM in 2008, Jim was vice president of product marketing and business development at Celequest, a California-based operational business intelligence vendor, which was acquired by Cognos in 2007. He has over 16 years of deep experience in enterprise software, business intelligence, business process management, business activity monitoring, big data, and automated software testing & monitoring. Jim holds an MS in Systems Management from the University of Southern California, and an undergraduate degree from the University of Colorado at Boulder.
32. Perceptions & Questions
Analyst: William McKnight
33. ANALYTICS: A BUSINESS IMPERATIVE
Formed from SUMMARIES of data
Tied to Business Actions
Continual Re-evaluation
e.g., Customer Segmentation and Profit
Adding Big Data!
34. ANALYTICS EXAMPLES
Number of customers in each customer state (optionally by product or multiple products)
Average balance of customers by geo
Average start date in each customer lifetime value decile by geo and device
New number of customers in each state
Propensity to churn by age band and device
Cost of acquisition by age and gender
Average session duration by cost of acquisition
Session duration differences between first and tenth session
Network with highest up time last month
Number of calls per session
Best performing ad network by day part in a geo, age band and device
And on and on and on and on….
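A few of the examples above can be sketched as plain SQL aggregates. The snippet below is only an illustration: it uses Python's built-in SQLite as a stand-in for whatever SQL engine is in use, the `customers` table and its columns are hypothetical, "customer state" is assumed to mean a lifecycle state such as active or lapsed, and an observed churn rate stands in for a modeled propensity to churn.

```python
import sqlite3

# Hypothetical sample rows:
# (customer_id, state, geo, age_band, device, balance, churned)
ROWS = [
    (1, "active", "TX", "18-34", "mobile", 120.0, 0),
    (2, "active", "TX", "35-54", "desktop", 80.0, 1),
    (3, "lapsed", "CA", "18-34", "mobile", 40.0, 1),
    (4, "active", "CA", "18-34", "mobile", 200.0, 0),
]


def build_db():
    """Create an in-memory database holding the hypothetical customers table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (customer_id INTEGER, state TEXT, geo TEXT, "
        "age_band TEXT, device TEXT, balance REAL, churned INTEGER)"
    )
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?, ?, ?, ?, ?)", ROWS)
    return conn


def customers_per_state(conn):
    """Number of customers in each customer state."""
    return dict(conn.execute("SELECT state, COUNT(*) FROM customers GROUP BY state"))


def avg_balance_by_geo(conn):
    """Average balance of customers by geo."""
    return dict(conn.execute("SELECT geo, AVG(balance) FROM customers GROUP BY geo"))


def churn_rate_by_band_and_device(conn):
    """Observed churn rate by age band and device (a stand-in for propensity)."""
    query = (
        "SELECT age_band, device, AVG(churned) FROM customers "
        "GROUP BY age_band, device"
    )
    return {(band, device): rate for band, device, rate in conn.execute(query)}


if __name__ == "__main__":
    conn = build_db()
    print(customers_per_state(conn))
    print(avg_balance_by_geo(conn))
    print(churn_rate_by_band_and_device(conn))
```

Each analytic is a summary of detail rows grouped by one or more dimensions, which is why the list can go "on and on": every new dimension or measure yields another query of the same shape.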
36. SMARTER MARKETING
Spend + Media Arbitrage Opportunities + Incremental Direct Marketing Spend Improvement:
Map Media Buys to the Best Customer Demographic
Do sponsorships align with customer base?
Monitored transactions, renewals, customer care calls
Leveraged data to pitch right product, right time
Decrease in marketing cost
Increase in revenue, profit, customer satisfaction
37. VEHICLES FOR BIG DATA
[Diagram labels: Data Warehouse; Regional and Departmental Views; Applications & Engines; Operational Analytics & Hot Views; Data Marts; ADS; Dependent; Independent; Relational Data; Conformed Dimensions]
38. THE EVER-EXPANDING DATA WAREHOUSE
[Chart labels: Cost; Last Year; This Year; Next Year]
• Enterprise Data Warehouse users face huge annual upgrade expenses
• To avoid this spend, organizations are looking for lower cost alternatives.
• Movement of data to tape not desired, because data is offline and not available for analytics
• Moving infrequently used data to Hadoop is a cost-effective, online option that preserves ability to query
39. DATA WAREHOUSE EXPANSION
1 As data volume increases exponentially, cost of warehousing rises also
2 Offload data to less expensive Hadoop cluster to save on data management costs
3 Combine Hadoop data with DW data for a more comprehensive view of history
4 Add operational data for greater insight and agility in analytics and BI
[Diagram labels: HDFS, HDFS, HDFS]
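The offload and combine steps on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not Actian's implementation: a single in-memory SQLite database stands in for both the data warehouse and the Hadoop cluster, and the table names `dw_sales` and `hadoop_sales`, the view `all_sales`, and the cutoff date are all hypothetical.

```python
import sqlite3

CUTOFF = "2014-01-01"  # hypothetical boundary between frequently and rarely used data


def build_stores():
    """Create stand-in tables for the warehouse and the Hadoop-side archive."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE dw_sales (sale_date TEXT, amount REAL)")
    conn.execute("CREATE TABLE hadoop_sales (sale_date TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO dw_sales VALUES (?, ?)",
        [
            ("2013-03-01", 10.0),
            ("2013-11-15", 20.0),
            ("2014-02-01", 30.0),
            ("2014-06-10", 40.0),
        ],
    )
    return conn


def offload_cold_rows(conn, cutoff):
    """Move rows older than the cutoff out of the warehouse table.

    Returns the number of rows removed from the warehouse.
    """
    with conn:  # commit both statements together, or roll back on error
        conn.execute(
            "INSERT INTO hadoop_sales SELECT sale_date, amount FROM dw_sales "
            "WHERE sale_date < ?",
            (cutoff,),
        )
        cur = conn.execute("DELETE FROM dw_sales WHERE sale_date < ?", (cutoff,))
    return cur.rowcount


def create_combined_view(conn):
    """Expose warehouse and offloaded rows as one queryable history."""
    conn.execute(
        "CREATE VIEW all_sales AS "
        "SELECT sale_date, amount, 'dw' AS source FROM dw_sales "
        "UNION ALL "
        "SELECT sale_date, amount, 'hadoop' AS source FROM hadoop_sales"
    )


def yearly_totals(conn):
    """Total sales per year across both stores."""
    query = (
        "SELECT substr(sale_date, 1, 4) AS year, SUM(amount) "
        "FROM all_sales GROUP BY year"
    )
    return dict(conn.execute(query))


if __name__ == "__main__":
    conn = build_stores()
    print(offload_cold_rows(conn, CUTOFF))
    create_combined_view(conn)
    print(yearly_totals(conn))
```

The point of the sketch is that offloaded data stays online and queryable: the combined view still answers questions over the full history, while the warehouse table holds only the recent rows.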
40. QUESTIONS FOR ACTIAN
Where should analytics be created: in a relational environment or in Hadoop? Where should they be analyzed? Do we have enough tools in a Hadoop environment to do analysis there?
How do businesses analyze a combination of structured and unstructured data?
Is it as simple as ‘structured data to the data warehouse or analytic one-offs and unstructured data to Hadoop’?
Is using Hadoop as a data refinery the best use of Hadoop?
Does any data go to both environments? Or do just summaries get shared?
Can price/performance of a database vendor’s product be superior to an open source product?