The Intelligent Thing -- Using In-Memory for Big Data and Beyond


The Briefing Room with John O'Brien and Teradata
Live Webcast on June 11, 2013

For traditional Data Warehousing and Big Data Analytics, research shows that a small percentage of enterprise data often comprises the lion's share of what's needed for queries. That's hot data, and organizations that know how to effectively harness that data can stay on top of what's happening. Conversely, cold data can certainly provide value at times, but should ideally be stored in ways that minimize cost. The more dynamically a company can manage this hot and cold data, the more efficient its information systems become.

Register for this episode of The Briefing Room to hear veteran database expert John O'Brien of Radiant Advisors as he outlines a strategy for managing hot and cold data. He'll be briefed by Alan Greenspan of Teradata, who will tout his company's Intelligent In-Memory solution, which optimizes the management of hot and cold data to keep analysts fueled with the data they need most. He'll also discuss Teradata Virtual Storage, which helps optimize the storage and provisioning of information assets.
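
To make the hot and cold data idea concrete, here is a minimal Python sketch of usage-based temperature tracking. It is an illustration only and not Teradata's algorithm: the class name `TemperatureTracker`, the `memory_slots` budget, the `decay` factor, and the sample block names are all hypothetical. Each access heats a block, periodic aging cools every block, and only the hottest blocks are kept in memory.

```python
from collections import defaultdict


class TemperatureTracker:
    """Toy model of usage-based data temperature (hypothetical, not Teradata's algorithm)."""

    def __init__(self, memory_slots, decay=0.5):
        self.memory_slots = memory_slots  # assumed number of blocks that fit in memory
        self.decay = decay                # assumed fraction of temperature kept per aging step
        self.temperature = defaultdict(float)

    def record_access(self, block_id):
        """Each read of a block heats it up by one unit."""
        self.temperature[block_id] += 1.0

    def age(self):
        """Cool every block so data that stops being used drifts out of memory."""
        for block_id in self.temperature:
            self.temperature[block_id] *= self.decay

    def in_memory(self):
        """Return the hottest blocks, up to the memory budget (ties broken by name)."""
        ranked = sorted(self.temperature.items(), key=lambda kv: (-kv[1], kv[0]))
        return [block_id for block_id, _ in ranked[: self.memory_slots]]


tracker = TemperatureTracker(memory_slots=2)
for block in ["sales_2013", "sales_2013", "sales_2013",
              "customers", "customers", "sales_2009"]:
    tracker.record_access(block)
hot_before = tracker.in_memory()  # ['sales_2013', 'customers']

# Query patterns change: old data cools, and a history block is suddenly popular.
tracker.age()
for _ in range(3):
    tracker.record_access("sales_2009")
hot_after = tracker.in_memory()   # ['sales_2009', 'sales_2013']
```

In this sketch the decay step is what lets the memory contents follow new query patterns: without it, data that was popular long ago would never cool off.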

Published in: Technology, Business

  1. The Briefing Room: The Intelligent Thing -- Using In-Memory for Big Data and Beyond
  2. The Briefing Room (Twitter Tag: #briefr)
     Welcome
     Host: Eric
  3. Mission
     • Reveal the essential characteristics of enterprise software, good and bad
     • Provide a forum for detailed analysis of today's innovative technologies
     • Give vendors a chance to explain their product to savvy analysts
     • Allow audience members to pose serious questions... and get answers
  4. JUNE: Database
     July: CLOUD
     August: HIGH PERFORMANCE ANALYTICS
     September: ANALYTICS
  5. Database
  6. Analyst: John O’Brien
     John O’Brien is Founder and CEO of Radiant Advisors
  7. Teradata
     • Teradata is known for its data analytics solutions with a focus on integrated data warehousing, big data analytics and business applications
     • It offers a broad suite of technology platforms and solutions; data management applications; and data mining capabilities
     • Teradata Intelligent Memory is a new capability that provides automated management of data based on temperature
  8. Alan Greenspan
     Alan Greenspan is Product Marketing Manager for Teradata Corporation. He is responsible for product marketing for the Teradata database, key database technologies, security and performance. Alan has more than 20 years with Teradata Corporation.
  10. Exploiting Technology Trends (Teradata Confidential)
      Trends
      • Memory is 3,000 times faster than disk
      • Memory per node is increasing
        > 96GB -> 256GB -> 512GB -> 768GB -> 1TB
      • Cost of memory is decreasing
      Issues
      • Memory still 80x more expensive than disk
      • Not all data fits into memory
      • Not all data worth 80x premium
      Teradata Solution
      • Create a new extended memory space for most frequently accessed data
  11. Teradata Intelligent Memory
      Innovative In-Memory Technology
      • New extended memory space
      • Improves query performance
      • A smarter approach than in-memory databases
      • Leverage large memory capacities in new platforms
  12. New Extended Memory Space
      • Sophisticated algorithms to track usage, measure temperature, and rank data
      • Complements FSG cache
      • Dynamically adjusts to new query patterns
      Intelligent Memory (most frequently used data): Hottest data placed and maintained in memory, aged out as it cools (very hot in, cool out)
      FSG Cache (most recently used data): Temporarily store data required for current queries, purges least recently used
  13. Improves Query Performance
      • 1% of data satisfies 43% of query activity
      • Hottest data in memory/not all the data
      • Integrated into Teradata system
      • No need for separate appliance
      Performance of in-memory databases without their cost
  14. A Smarter Approach than In-Memory Databases
      • Automatic
      • Transparent
      • No DBA effort
      • No SQL changes
      • Maintain user access to ALL data for analysis
      Extend multi-temperature data management to memory
  15. Leverage Large Memory Capacities in New Platforms
      • Memory Capacities Growing Exponentially
      • Traditional Cache Reaches Diminishing Returns
      • Data is stored compressed and in columns and rows
      • Created extended memory space beyond cache
      • Use it in a new innovative way
  16. Teradata Workload-Specific Platforms
      Model  | Platform
      670    | Data Mart Appliance
      Future | Extreme Data Appliance
      2700   | Data Warehouse Appliance
      6700   | Active Enterprise Data Warehouse
      (Diagram label: Teradata Intelligent Memory)
  17. Configuration Requirements
      • All Members of the Workload-Specific Platform Family
      • Minimum Memory Requirements
        > Recent models only
        > May require memory upgrade
      • Requires Teradata Database 14.10
      • Teradata Virtual Storage is not required
      Platform                         | Model  | Memory/Node | FSG Cache + I.M./AMP
      Active Enterprise Data Warehouse | 6700   | 512GB       | 8GB
      Data Warehouse Appliance         | 2700   | 256GB       | 5GB
      Data Mart Appliance              | 670H   | 256GB       | TBD
      Extreme Data Appliance           | Future | TBD         | TBD
  18. Teradata Intelligent Memory and UDA
      • Teradata SQL-H allows Hadoop data to take advantage of Teradata Intelligent Memory
      • Hadoop data that is persisted in Teradata and becomes very hot will dynamically move into Teradata Intelligent Memory
  19. Intelligent Memory vs In-Memory Databases
      Feature                  | Teradata Intelligent Memory | In-Memory Databases
      All data in memory       | Wrong goal                  | Small data sets
      Big data per node        | Yes                         | No
      Columnar                 | Yes + rows                  | Yes
      Memory-speed performance | Yes                         | Yes
      Compression              | Yes                         | Yes
      Recovery snapshot        | Yes                         | Yes
      SSD/HDD logging          | Yes                         | Yes
      Indexes, aggregates      | Yes                         | No
      Large node memories      | Yes                         | Yes
  20. Reload on Reboot
      Candidates
      VH cylinders | temp
      Cyl 56       | 100
      Cyl 21       | 100
      Cyl 22       | 99
      Cyl 88       | 99
      Cyl 42       | 98
      Cyl 66       | 95
      (Diagram label: Intelligent memory)
  21. Not Every Workload is IO Bound
      • In memory expectations
        > All “my queries” are faster
      • Business value
        > Majority of queries are faster
        > Improved response time
      • Intelligent Memory won’t help
        > CPU constrained queries
        > Deep history queries
          – Very Hot + cold data joins
        > 1-3 second queries
        > Data loading
      (Diagram label: Node)
  22. Teradata Intelligent Memory Sample Quotes
      May 2013 Coverage Report
      • Teradata takes on SAP's HANA with in-memory technologies push
        "Teradata Intelligent Memory technology is built into the data warehouse and customers don't have to buy a separate appliance"
      • Teradata gets into the in-memory biz to take on SAP’s HANA
        "Data analytics veteran Teradata will not let the new era of data-analysis architectures pass it by without a fight. It has already built products to address massive data volumes and Hadoop"
      • Teradata boosts DRAM on appliances for in-memory queries
        "You don't need no stinkin' HANA or Exalytics"
      • Teradata enters the in-memory fray, intelligently
        "Teradata Intelligent Memory combines RAM and disk for high-performance Big Data without the extreme requirement of exclusive in-memory operation"
      • Teradata Extends In-Memory Computing Reach
        "Teradata Intelligent Memory, an approach to in-memory computing that allows the workloads running on a Teradata database appliance to make use of extended memory."
      • Teradata Leverages In-Memory Technology For Big Data
        "Teradata (TDC) introduced Intelligent Memory, a new database technology that creates extended memory space"
  24. Perceptions & Questions
      Analyst: John O’Brien
  25. REDEFINING HOT AND COLD DATA
      Inside Analysis – Teradata Intelligent Memory System
      June 11, 2013
      John O’Brien | Principal and Founder, Radiant Advisors
      @obrienjw
      © Copyright 2013 Radiant Advisors. All Rights Reserved
  26. ILM PRINCIPLE
      Redefining Hot and Cold Data
      Information Management Lifecycle:
      “Storage is optimized when the value of information is persisted on the corresponding storage cost.”
      By using the age of the information, users can define its value as hot, warm, or cold temperatures then leverage corresponding tiers of data storage…
  27. PREVIOUS DATA AGING STORAGE TIERS
      Redefining Hot and Cold Data
      Information Lifecycle Challenges:
      • Requires business usage definition to script migration
      • Different business data may have different aging policies
      • Not all policies are time based (status based)
      • Marking read-only, backups
      • Isolate data, partition-based
      - Very operational oriented -
      Try to analyze 3 years of business activity by demographic, products, or locations (hits weakest link storage tier)
      [Diagram: Database Server (SMP) attached to Storage Sub-Systems. Defining and Managing Individual Data Record Policies for each tier. Each tier's relative cost is marked with $ symbols.]
      Tier         | Media                    | Connectivity | Capacity | Data age   | Use
      Fast Disks   | Fast Disks, SSD          | Fast         | Smaller  | 1-30 days  | This Month
      Medium Disks | Medium Disks, 15,000 rpm | Fast         | Medium   | 30-90 days | This Quarter
      Medium Disks | Medium Disks, 7,200 rpm  | Slow         | Medium   | 3-12 mos.  | This Year
      Bulk Disks   | Slow Disks, 5,400 rpm    | Slow         | High     | 12-24 mos. | Yr-over-Yr
      Tape         | Tape                     | Slow         | Highest  | 2+ years   | compliance
  28. MPP SOLVES ANALYTIC WORKLOADS
      Redefining Hot and Cold Data
      MPP Multi-tier challenges:
      • Still requires business usage definition to script data migration
      • Partition key setting and data skewness
      Parallelism overcomes weakest link partition isolation
      Does the age of a record correspond to its value in analytics?
      [Diagram: Database Server (MPP) with a Node Array. Defining and Managing Individual Data Record Policies for each MPP tier. Each tier's relative cost is marked with $ symbols.]
      Node tier    | Capacity | CPU/Memory | Disks                               | Data age           | Use
      Fast Nodes   | Small    | More       | Fast Disks, Solid State Disks       | 1-30 days          | This Month
      Medium Nodes | Medium   | Avg        | Medium Disks, 1s terabytes per node | 1 – 12 mos.        | This Year
      Bulk Nodes   | High     | Low        | Slow Disks, 10s Terabytes per node  | 1 - n years online | History
  29. OPTIMIZING THE MPP PLATFORM
      Redefining Hot and Cold Data
      Intelligent Memory:
      • Determines value of data by its usage in the business via activity metrics and algorithms
      • Automatically and transparently moves data to the appropriate tier
      • Data movement is bi-directional as data heats up or cools off
      • Loaded data can start hot and cool off
      MPP to overcome partitioning
      Usage to govern storage tiers
      [Diagram: Database Server (MPP) with a Node Array. Intelligent Memory management based on business usage.]
      Node tier    | Capacity | CPU/Memory | Disks                               | Temperature | Usage
      Fast Nodes   | Small    | More       | Fast Disks, Solid State Disks       | Hot         | Most Often
      Medium Nodes | Medium   | Avg        | Medium Disks, 1s terabytes per node | Warm        | Occasional Use
      Bulk Nodes   | High     | Low        | Slow Disks, 10s Terabytes per node  | Cold        | Rarely Used and Available
  30. THE NEW PARADIGM FOR ANALYTICS
      Redefining Hot and Cold Data
      By using the age of the information, users can define its value as hot, warm or cold temperatures matching corresponding tiers of storage…
      Which metadata represents analytic value?
      By monitoring a BI system’s analytic usage, the system can define its analytic value as hot, warm, or cold temperatures and then transparently persist data intelligently
  31. THANK YOU!
      For more information
      www.RadiantAdvisors.com
      Twitter: @RadiantAdvisors #ModernBI #RediscoveringBI
      RSS: feed://
      us at: info@RadiantAdvisors.com
      LinkedIn: Rediscovering BI monthly
  32. QUESTIONS
      • If Teradata Intelligent Memory can optimize a BI system’s storage persistence, how do you know what percentage of each storage tier to configure beforehand? Is it simply an economic decision at that point (the most memory and fast disk that I can afford)?
      • For the secret-sauce algorithms being used in the IOPS monitoring by TIM, generally how fast do data sets “warm up” or “cool off” with usage?
      • If I can anticipate high usage for a given data set on an upcoming Monday morning event, is there a way to bypass warming up and designate the data as hot?
      • What are the boundaries for TIM optimization within Teradata Aster and are there future plans for expansion and enhancements?
  34. Upcoming
      July: CLOUD
      August: HIGH PERFORMANCE ANALYTICS
      September: ANALYTICS
  35. Thank You for Your Attention
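
The analyst slides contrast assigning storage tiers by the age of a record with assigning them by how the data is actually used. A minimal Python sketch of that contrast follows. The age thresholds follow the 1-30 days, 1-12 months, and 1-n years bands from the MPP slide; the usage thresholds, function names, table names, and read counts are all made-up assumptions for illustration.

```python
from datetime import date


def tier_by_age(loaded_on, today):
    """Age-based policy: the tier follows how old the data is."""
    age_days = (today - loaded_on).days
    if age_days <= 30:
        return "hot"
    if age_days <= 365:
        return "warm"
    return "cold"


def tier_by_usage(reads_last_30_days):
    """Usage-based policy: the tier follows how often the data is read.

    The thresholds (100 and 10 reads) are arbitrary illustrative values.
    """
    if reads_last_30_days >= 100:
        return "hot"
    if reads_last_30_days >= 10:
        return "warm"
    return "cold"


today = date(2013, 6, 11)

# (table name, load date, reads in the last 30 days) -- hypothetical sample data
tables = [
    ("orders_2013_06", date(2013, 6, 1), 250),
    ("orders_2011_12", date(2011, 12, 1), 180),   # old, but heavily used
    ("audit_log_2013_05", date(2013, 5, 20), 2),  # recent, but rarely read
]

# Map each table to its (age-based tier, usage-based tier) pair.
comparison = {
    name: (tier_by_age(loaded, today), tier_by_usage(reads))
    for name, loaded, reads in tables
}

for name, (by_age, by_usage) in comparison.items():
    print(f"{name}: age-based={by_age}, usage-based={by_usage}")
```

The two policies agree on the newest orders table but disagree on the other two, which is the situation the question "Does the age of a record correspond to its value in analytics?" points at.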