
TCOD: A Framework for the Total Cost of Big Data - December 6, 2013 - WinterCorp - V17

Big Data: What Does it Really Cost?
The WinterCorp Real Cost of Big Data research compares the total cost of an analytic data solution on Hadoop and on a data warehouse. Learn about:
- The major cost components of an analytic big data project and how they are estimated in the total cost of data (TCOD) framework
- Why it is critical to consider total project cost, not just platform cost
- How the costs differ on a Hadoop platform and a data warehouse platform
- An example where the Hadoop platform is more cost-effective
- An example where the data warehouse platform is more cost-effective
- Why you need both Hadoop and data warehouse platforms in your analytic data architecture

Key charts are posted here. Full report is available at www.wintercorp.com/tcod-report

Published in: Technology, Business

  1. WINTERCORP
     TCOD: A Framework for the Total Cost of Big Data (key charts)
     Research Report: wintercorp.com/tcod-report
     Spreadsheet: wintercorp.com/tcod-spreadsheet
     Key Charts: wintercorp.com/tcod-charts
     Richard Winter, WinterCorp
     December 6, 2013, V17
     THE LARGE SCALE DATA MANAGEMENT EXPERTS
  2. Total Cost of Data (TCOD)
     Diagram labels (diagram not to scale): Software Development/Maintenance, Analytics, Queries, Apps, Admin, ETL*, System
     TCOD is the cost of storing, managing and using data over time for analytic purposes.
     * ETL is extract, transform and load (preparing data for analytic use)
     © 2010-2013 Winter Corporation, Cambridge MA. All rights reserved.
  3. Data Refining Example: Data from Turbines
  4. Data Refining Example: Data Management Requirements
     1. Hundreds of TB of data per week – 500 TB data capacity
     2. Raw data life: few hours to a few days
     3. Challenge: find the important events or trends quickly
     4. Massive analysis problem
     5. When analyzing, read entire files
     6. Keep only the significant data
  5. Cost Comparison: Engineering Example – Data Refining
     Chart (not to scale): On Hadoop $9.3m; On Data Warehouse Appliance* $30m
                           Data Warehouse Appliance   Hadoop
     Volume of Data        500 TB                     500 TB
     System Cost           $23 million                $1.3 million
     Total Cost of Data    $30 million                $9.3 million
     * Performance class of DW Appliance – not the lowest price class
  6. Observations on Hadoop
     1. Many examples of the data refining requirement in engineering, operations, business, science, healthcare
     2. Cost equation is favorable to Hadoop in these applications even with a wide variety of data types
     3. There are also many other excellent Hadoop use cases
        - Data landing zone
        - Archive
        - Intensive batch processing of data
     4. Example is one illustration of Hadoop sweet spot
  7. Business Example: Enterprise Data Warehouse
  8. Business Example - EDW: Data Management Requirements
     1. Data volume
        a. 500 TB to start – all retained for at least five years
        b. Continual growth of data and workload
     2. Data sources: thousands
        a. Data sources change their feeds frequently
        b. New data sources are frequent
     3. Challenges
        a. Data must be correct
        b. Data must be integrated
     4. Typical enterprise data lifetime: decades
     5. Analytic application lifetime: years
     6. Many thousands of data users (10^4 – 10^6)
     7. Hundreds of analytic applications
     8. Thousands of one-time analyses
     9. Tens of thousands of complex queries
  9. Cost Comparison: Business Example – EDW
     Chart (not to scale): On EDW Platform $265 million; On Hadoop $740 million
     Chart legend: Total System Cost, System and Data Admin, ETL, Application Development, Complex Queries, Analysis
     Volume of Data: 500 TB
                           Data Warehouse Platform    Hadoop
     System Cost           $45 million                $1.4 million
     Total Cost of Data    $265 million               $740 million
  10. Conclusions – Two TCOD Examples
      [Two bar charts, vertical axes in millions of dollars, each comparing On Hadoop with On Data Warehouse. Left chart, $0 to $35: "Data Refining: Hadoop wins. Also: Landing Zone, Archive". Right chart, $0 to $800: "EDW: Data W/H Platform Wins". Legend: Total System Cost, System and Data Admin, Application Development, ETL, Complex Queries, Analysis.]
      1. TCOD is NOT platform cost
      2. Each technology has large advantages in its sweet spot(s)
      3. Neither platform is cost-effective in the other's sweet spot
      4. Biggest differences for the data warehouse are the development of:
         - Complex queries
         - Analytics
  11. TCOD Framework: Additional Notes
      Not taken into account:
      - Actual system workloads, concurrency, availability requirements
      - Cost of preparing simple queries
      - Cost of query execution
      - Workload management
      - Vendor-supported distributions of Hadoop/Hadoop Appliances
      - ETL products available with Hadoop
      New Products Should Eventually Decrease TCOD with Hadoop:
      - Cloudera Impala, IBM BigSQL, Teradata SQL-H, EMC Pivotal
      - New version of Hive supports subset of SQL
      - Further analysis, evaluation and measurement is required
  12. In Conclusion
      1. TCOD estimates what your company will really spend to get to your business goal.
      2. Total cost is extremely sensitive to technology choice.
      3. Analytic architectures will require both Hadoop and data warehouse platforms.
      4. Focus on total cost, not platform cost, in making your choice for a particular application or use.
      5. Many analytic processes will use both Hadoop and data warehouse technology – so integration counts!
      Questions and comments welcome at tcod@wintercorp.com
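
The point that TCOD is not platform cost can be checked against the figures on slides 5 and 9. The following is a minimal sketch, not part of the WinterCorp report or spreadsheet: the dictionary layout and the function names (`system_share`, `lowest`, `report`) are assumptions made for illustration, and only the dollar figures come from the slides.

```python
# Figures in millions of US dollars, copied from the two cost-comparison slides.
# The data layout and all function names are illustrative assumptions.
EXAMPLES = {
    "data refining": {
        "hadoop": {"system": 1.3, "tcod": 9.3},
        "data warehouse": {"system": 23.0, "tcod": 30.0},
    },
    "edw": {
        "hadoop": {"system": 1.4, "tcod": 740.0},
        "data warehouse": {"system": 45.0, "tcod": 265.0},
    },
}


def system_share(costs):
    """Fraction of the total cost of data accounted for by system cost."""
    return costs["system"] / costs["tcod"]


def lowest(platforms, key):
    """Name of the platform with the smallest value for the given cost key."""
    return min(platforms, key=lambda name: platforms[name][key])


def report(examples):
    """Build one text line per platform plus one summary line per use case."""
    lines = []
    for use_case, platforms in examples.items():
        for name, costs in platforms.items():
            lines.append(
                f"{use_case:14s} {name:15s} "
                f"system ${costs['system']:.1f}m  TCOD ${costs['tcod']:.1f}m  "
                f"system share {system_share(costs):.1%}"
            )
        lines.append(
            f"{use_case:14s} lowest system cost: {lowest(platforms, 'system')}; "
            f"lowest TCOD: {lowest(platforms, 'tcod')}"
        )
    return lines


if __name__ == "__main__":
    print("\n".join(report(EXAMPLES)))
```

In the EDW example the platform with the lowest system cost (Hadoop) is not the platform with the lowest TCOD, which is the pattern conclusion 1 on slide 10 describes.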
