In-Memory & Hadoop:
Real-time Big Data Intelligence

© 2013 Terracotta Inc. | Internal Use Only
Your speaker

Manish Devgan
Director of Product Management
Terracotta

What we’ll cover in this webcast
• What’s Hadoop? (quick intro)
• Hadoop’s weaknesses
• Emerging best practices for combining Hadoop and in-memory data management
• Real-time intelligence example
• Getting started with in-memory and Hadoop
• Q&A

What is Hadoop?

Hadoop is an open-source data management framework used to draw insights from data.

Components
• HDFS*: scalable and distributed storage
  − Data is distributed across cluster nodes
  − The name node keeps track of where data is located
• MapReduce: parallel processing of data
  − Splits a task for processing based on data locality, then assembles the results
  − Consists of a Map() procedure for filtering and sorting and a Reduce() procedure for summarizing

Benefits
• Scalable: efficiently store and process large data sets
• Reliable: redundant storage, with failover across the cluster
• Rich & flexible: complementary set of tools and frameworks; store data in any format
• Economical: deploy on commodity hardware

*Hadoop Distributed File System
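
The Map()/Reduce() split described above can be illustrated with a minimal sketch in plain Python. This is not Hadoop's API: the function names, the sample records, and the sequential "splits" are all assumptions made for illustration. It only shows the shape of the model, where Map() filters and emits key/value pairs, the pairs are grouped and sorted by key, and Reduce() summarizes each group.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map(): filter out unwanted records and emit (key, value) pairs."""
    customer, amount = record
    if amount > 0:  # filtering step: drop refunds and bad rows
        yield customer, amount


def shuffle(pairs):
    """Group values by key and sort by key (the step between map and reduce)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_phase(key, values):
    """Reduce(): summarize all values collected for one key."""
    return key, sum(values)


def run_job(splits):
    # In a real cluster each split is processed on a node that holds the data;
    # here the splits are simply processed one after another.
    mapped = chain.from_iterable(map_phase(r) for split in splits for r in split)
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped))


# Hypothetical input: two data splits of (customer, purchase amount) records.
splits = [
    [("alice", 30), ("bob", 20), ("alice", -5)],
    [("bob", 15), ("carol", 40)],
]
totals = run_job(splits)
print(totals)
```

The sketch runs in one process, so it says nothing about distribution or failover; it is only meant to make the filter, group, and summarize roles concrete.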

With Hadoop, you can ask interesting questions about your data and get answers economically.

Questions Hadoop can help answer
• How can I target promotions to my customers for better sales?
• How risky is each of my customers?
• Which advertisement should I show to optimize return?
• How relevant is a result for a given search?
• When is my machinery likely to malfunction?

Hadoop’s Weaknesses

• No support for real-time insights
• No support for interactive and exploratory data analysis
• Challenging framework for computation beyond MapReduce
• Lacks tools for business analysts

Emerging best practices for combining Hadoop and in-memory data management

Combining Hadoop and In-memory Data Management

- Businesses are looking for ways to mine real-time insights to provide competitive advantages
- Increased adoption of transactional system data for analytics is blurring the line between OLTP and OLAP
- New frameworks and products are bringing in-memory technologies to the Hadoop ecosystem

Real-time Data Integration with Hadoop

[Diagram: three layers]
• Real-time Data Apps: Web Apps, Mobile Apps, Dashboards & Mashups, Transactional Apps, Data Feeds, Operational Intelligence
• In-memory Data Management Platform (flows labeled “Real-time data” and “Real-time Insights”)
• Data Sources: Events, Log Data, POS Data, Social Media, Sensors, Images/Videos

Real-time intelligence example

BigMemory & Hadoop in financial services
Before: Custom ETL connector pushing batch data

[Diagram labels: BigMemory Store; Hadoop Cluster; Hadoop M/R; Short Term Transaction Data; Long Term Transaction Data; Rules & Triggers; Credit Reference Data; Tagged Accounts; HDFS to BigMemory Processing]
BigMemory & Hadoop in financial services
Today: Streaming Data insights

[Diagram labels: BigMemory Store; BigMemory-Hadoop Connector; Insights; Short Term Transaction Data; Long Term Transaction Data; Rules & Triggers; Credit Reference Data; Hadoop M/R; Tagged Accounts; Hadoop Cluster]
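
The slide does not spell out the data flow, so the following is only a rough sketch of one way the labeled pieces might fit together: insights computed by a batch job are loaded into an in-memory store, and rules are evaluated against each incoming short-term transaction to tag accounts immediately. Everything here is an assumption for illustration. `InMemoryStore`, `spend_spike`, the account IDs, and the threshold are hypothetical, plain Python dicts stand in for the store, and no BigMemory or connector API is used.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryStore:
    """Hypothetical stand-in for an in-memory store (plain dicts, no real product API)."""
    batch_profiles: dict = field(default_factory=dict)  # insights pushed from batch jobs
    recent_txns: dict = field(default_factory=dict)     # short-term transaction data
    tagged_accounts: set = field(default_factory=set)

    def load_insights(self, profiles):
        """Accept the results of a batch job, e.g. long-term average spend per account."""
        self.batch_profiles.update(profiles)

    def record_transaction(self, account, amount, rules):
        """Store the transaction and evaluate every rule against it right away."""
        self.recent_txns.setdefault(account, []).append(amount)
        profile = self.batch_profiles.get(account)
        if profile is not None and any(rule(amount, profile) for rule in rules):
            self.tagged_accounts.add(account)


def spend_spike(amount, profile, factor=5.0):
    """Hypothetical rule: flag a transaction far above the account's long-term average."""
    return amount > factor * profile["avg_amount"]


store = InMemoryStore()
# Assumed batch output: per-account averages computed over long-term transaction data.
store.load_insights({"acct-1": {"avg_amount": 40.0}, "acct-2": {"avg_amount": 900.0}})

for account, amount in [("acct-1", 35.0), ("acct-1", 450.0), ("acct-2", 1200.0)]:
    store.record_transaction(account, amount, rules=[spend_spike])

print(sorted(store.tagged_accounts))
```

The design point the sketch tries to capture is that the rule check happens on the write path, in memory, instead of waiting for the next batch run to surface the same account.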
Getting started with in-memory and Hadoop

How to get started with In-memory and Hadoop?
• If you already have a Hadoop project, look for use cases where you want real-time access to insights
• Start with a small-to-medium sized (20-40 nodes) cluster with a well-defined use case requiring fast access to data
• Consider exploratory use cases where you’re doing iterative analysis on a data set to get answers faster

In-Memory & Hadoop

Questions
Please type yours in the “Questions” panel or in the chat window.

Connect with Terracotta
• Download “BigMemory & Hadoop” white paper
  − Visit: www.terracotta.org (Resources > White Papers)
• Download “BigMemory-Hadoop Connector”
  − Visit: www.terracotta.org/downloads/hadoop-connector
• Contact Manish Devgan
  − Email: mdevgan@terracottatech.com
• Follow us on Twitter
  − @big_memory
• Stay Tuned

