Does Big Data Spell Big Costs- Impetus Webinar

Impetus webinar on ‘Does Big Data Spell Big Costs?’

Register at http://bit.ly/zAXFmV

Published in: Technology, Business
Does Big Data Spell Big Costs- Impetus Webinar

  1. Impetus Technologies Inc.: Does Big Data Spell Big Costs?
     © 2014 Impetus Technologies
     Recorded version available at http://www.impetus.com/webinar_registration?event=archived&eid=56
  2. Outline
     • Big Data – Current Scenario
     • Cost components in a Big Data Warehouse
     • Best Practices - Reducing the cost of Big Data solutions
       – Cost of storage
       – Technologies- What and Where?
       – Big Data strategies
       – Our recommendations to reduce TCO
  3. Big Data – Current Scenario
     • 2.5 quintillion bytes: data produced every day
     • $6 trillion: Big Data cost (IDC/EMC)
     • $650 billion: cost of wasted productivity because of information overload
     • 1 ZB: estimated Internet traffic by 2015
     • 1800 EB: size of the digital universe in 2011
     • 90%: share of the data in the world today that has been created in the last two years alone
     • 18 months: estimated time for the digital universe to double
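The last two figures on this slide (1800 EB in 2011, doubling every 18 months) imply a simple exponential extrapolation. The sketch below is only an illustration: it assumes the doubling period stays constant, and the function name `projected_size_eb` is invented for this example.

```python
def projected_size_eb(base_eb: float, months_elapsed: float,
                      doubling_months: float = 18.0) -> float:
    """Extrapolate data volume, assuming a constant doubling period."""
    return base_eb * 2 ** (months_elapsed / doubling_months)


# Slide figures: 1800 EB in 2011, doubling every 18 months.
# 36 months later (2014) that is two doublings.
size_2014 = projected_size_eb(1800, months_elapsed=36)
print(f"{size_2014:.0f} EB")  # prints "7200 EB"
```

Under that assumption the 2011 figure would quadruple by 2014, which is why storage cost per unit matters more than the absolute size today.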
  4. Age of Data
     [Figure labels: Age of Software; Age of Data]
  5. Existing State of Big Data
     [Figure label: Solution]
  6. Using Commodity H/w for Big Data
     Commodity Hardware
     • Pros
       – Build your own
       – The promise of innovation
     • Cons
       – Building reliable storage – $1 per GB
       – Add the cost of managing / monitoring / hosting
  7. Using Open Source & Cloud Computing
     Open Source
     – Pros
       • Software is free!! Glory to the Elephant
     – Cons
       • Cost of Training – thinking parallel is not intuitive
       • Cost of Support – support is not free
     Cloud Computing
     – Pros
       • Rent what you need
     – Cons
       • $14,000 a month for 100 TB data – storage only
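The two price points quoted on slides 6 and 7 ($1 per GB for commodity storage, $14,000 a month for 100 TB in the cloud) can be put on a common footing with a little arithmetic. This is a rough sketch, not a TCO model: it assumes decimal units (1 TB = 1000 GB), treats the $1 per GB as a one-time hardware cost, and leaves out the managing, monitoring and hosting costs that slide 6 says must be added. All function names are made up for the example.

```python
GB_PER_TB = 1000  # assumption: decimal terabytes


def cloud_cost_per_gb_month(monthly_usd: float, data_tb: float) -> float:
    """Cloud storage price expressed in USD per GB per month."""
    return monthly_usd / (data_tb * GB_PER_TB)


def commodity_hardware_cost(data_tb: float, usd_per_gb: float = 1.0) -> float:
    """One-time hardware cost at the slide's $1 per GB (hardware only)."""
    return data_tb * GB_PER_TB * usd_per_gb


def breakeven_months(upfront_usd: float, monthly_usd: float) -> float:
    """Months of cloud rent that add up to the upfront hardware cost."""
    return upfront_usd / monthly_usd


cloud_rate = cloud_cost_per_gb_month(14_000, 100)   # USD per GB-month
hardware = commodity_hardware_cost(100)             # USD, hardware only
months = breakeven_months(hardware, 14_000)

print(f"cloud: ${cloud_rate:.2f} per GB-month")
print(f"commodity hardware for 100 TB: ${hardware:,.0f}")
print(f"hardware-only break-even: {months:.1f} months")
```

With these inputs the cloud price works out to $0.14 per GB-month, and the hardware-only break-even is a little over seven months; operating costs on the commodity side would push that point later.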
  8. Big Data Warehouse- Cost Components
     • Initial entry costs - Cost of experimentation
     • Cost of integration and moving data - Cost of ETL
     • Query and analytics capability
     • Manageability
     • On-going maintenance - Monitoring and tuning
     • Changing capacity - Additional hardware
     • Cost of compliance
  9. Lowering TCO of Big Data
     • Hardware
       – Lower cost of storage
       – Lower cost of computation
     • Software
       – Make things faster
       – Do more with less
  10. How to reduce the cost of storage?
      – Compress – RainStor and similar solutions
        • Just make sure your ‘Read Throughput’ is high
      – Retain all v/s load & process
        • Set up “data pipelines” or use ILM principles:
          – Creation and Receipt
          – Distribution
          – Use
          – Maintenance
          – Disposition
      – Focus on Big Data but don’t forget the “Small Data”
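The compression advice on slide 10 comes with a caveat: the space saved is only worth it if reads stay fast. A minimal way to check both numbers on a sample of your own data is sketched below. It uses Python's standard `zlib` purely as a stand-in codec (it is not RainStor, and the ratios will differ), and the sample records and the name `compression_report` are invented for illustration.

```python
import time
import zlib


def compression_report(raw: bytes, level: int = 6) -> dict:
    """Return the compression ratio and the decompression (read) throughput."""
    compressed = zlib.compress(raw, level)

    start = time.perf_counter()
    restored = zlib.decompress(compressed)
    elapsed = time.perf_counter() - start

    if restored != raw:
        raise ValueError("round trip failed")

    return {
        "ratio": len(raw) / len(compressed),
        # Throughput is measured against the uncompressed size, in MB/s.
        "read_mb_per_s": (len(raw) / 1e6 / elapsed) if elapsed > 0 else float("inf"),
    }


# Hypothetical, highly repetitive log-like records; real data compresses less.
sample = b"2014-01-01,store-17,sku-42,qty=1\n" * 50_000
report = compression_report(sample)
print(f"ratio {report['ratio']:.1f}x, read {report['read_mb_per_s']:.0f} MB/s")
```

Running the same report at several compression levels, on representative data, shows where a higher ratio starts to cost read throughput.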
  11. Technologies: What and Where?
      What?
      • Open Source vs. Commercial software?
      • Specialized hardware/appliances vs. commodity hardware?
      • Vendor lock-in vs. vendor independence?
      • Cost of latencies?
      • Cloud?
      Where?
      • OLTP - NoSQL v/s OLAP - DW (MapReduce & MPP)
  12. OLAP: Big Data Scenarios
  13. Data Tapping Point, Cost & Latency
  14. Indirect Analytics over Hadoop
  15. Direct Analytics over Hadoop
  16. Analytics over Hadoop with MPP DW
  17. Selecting the Right Technology
      Key considerations
      – $ per TB
      – Business Continuity/ Cost/ Vendor Lock-in
      – Latency Needs
  18. Choosing MPP
      $ per TB Driven
      – EMC Greenplum
      – Teradata, Aster
      – HP Vertica
      – Oracle Exadata
      – Netezza
      – ParAccel
      – Others
  19. Faster Map Reduce & Hadoop
      Business Continuity/ Cost/ Vendor Lock-in
      – MapR
      – HPCC
      – Hadapt
      – Pervasive DataRush, HStreaming
      – Cloud Map Reduce
      – DataStax
      – Platform Computing
      – MARS, GPMR
      – ParStream
  20. OLTP: NoSQL Solutions
      Latency Needs
      • Column stores – HBase, Cassandra
      • Document stores – MongoDB, CouchDB
      • Key-value stores – Redis, Riak etc.; Kyoto Cabinet/Tokyo Tyrant, Berkeley DB
      • GraphDB – Neo4j
      • Cloud stores – SimpleDB
  21. OLTP: New Era RDBMS Version
      • Postgres, InfiniDB, Infobright
      • MySQL Cluster
      • GridSQL, EnterpriseDB
      • MS SQL
      • Sybase IQ
      • Specialized stores – VoltDB, MarkLogic, Clustrix
      • Xeround
      • ParStream
      • Oracle NoSQL
  22. Recommendations- Cost Components of Big Data Warehouse
      • Initial Entry Costs - Cost of Experimentation
        We recommend – Follow Best Practices, Learn or Hire
      • Cost of Integration and Moving Data - Cost of ETL
        We recommend – Remove costly licensed tools, switch to Map Reduce for ETL or ELT
      • Manageability - Provisioning, management tools
        We recommend – Opt for multi-vendor management toolsets, e.g. Impetus Ankush
      • On-Going Maintenance - Monitoring and Tuning
        We recommend – Automate! Automate! Automate!
      • Changing Capacity - Additional Hardware
        Do you know the GPU?
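The ETL recommendation on slide 22 (replace licensed tools with Map Reduce) is easier to picture with the shape of such a job in front of you. The sketch below runs the map, shuffle and reduce steps in a single Python process; it is not Hadoop code, and the input records, field layout and function names are all hypothetical. The map step does the extract and transform (parse, drop malformed rows), and the reduce step produces the aggregate that would be loaded into the warehouse.

```python
from collections import defaultdict

# Hypothetical raw input: date, store id, sale amount.
RAW_LINES = [
    "2014-01-01,store-17,19.99",
    "2014-01-01,store-17,5.01",
    "2014-01-01,store-03,7.50",
    "bad line",
]


def map_phase(line):
    """Extract + transform: emit (store, amount) pairs, skipping malformed rows."""
    parts = line.split(",")
    if len(parts) != 3:
        return
    _date, store, amount = parts
    try:
        value = float(amount)
    except ValueError:
        return
    yield store, value


def shuffle(pairs):
    """Group values by key, as the framework would do between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Aggregate one key's values into the row to be loaded."""
    return key, round(sum(values), 2)


def run_job(lines):
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


totals = run_job(RAW_LINES)
print(totals)
```

On a real cluster the same map and reduce functions would run in parallel over file splits, which is where the cost advantage over a single licensed ETL server is expected to come from.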
  23. Recommendations- Hardware & Software
      • Cost of Storage - Compress Data
        We recommend – Opting for RainStor/ similar solutions
      • Do More with Less - Faster MR
        We recommend
        – MapR/ similar solutions
        – Acunu and related solutions for NoSQL
  24. About Impetus
  25. • Strategic partners for software product engineering and R&D
      • Thought leaders in cutting-edge technologies
      • Mature processes and practices that are methodical, yet flexible
      • Diverse domain expertise
      Our Services in Big Data and Analytics
      Expert Consulting Proof of Concept & Implementation Support Services
  26. Big Data Quick Start Program
      Three Modules
      • Gear up (1 day session)
      • Base Camp (4 day session)
      • Summit (5 day session)
  27. Q & A
  28. Thank You
      Write to us at inquiry@impetus.com
      Follow us on Twitter @impetustech
