Exploring cloud for data warehousing

Description of the basic cloud principles, the cost & deployment model for cloud, shortcomings for BI workloads beyond modest scale, some stats on market adoption/preference of cloud for DW.

Usage Rights: CC Attribution-NonCommercial-NoDerivs License

Presentation Transcript

  • Exploring Cloud Computing Options for Data Warehousing. July 26, 2012. Mark Madsen, @markmadsen, www.ThirdNature.net
  • Cloud Computing: "…a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction." http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf
    What people see: a seemingly infinite resource to apply to performance problems on short notice and at low cost.
  • Generators: Expensive Product
  • Generators: Commodity Product
  • Generators as a Service: Electricity
  • The Natural Process of Commoditization. Simon Wardley, A Lifecycle Approach to Cloud Computing
  • Managing Hardware Resources: Systems are sized for the peak workload, with the expectation that it will fluctuate. [Chart: capacity and demand, resources over time]
  • Idle resources = low utilization = money wasted. [Chart: capacity, demand, and idle resources over time]
  • Not enough resource is (much) worse than too much. [Chart: capacity and demand, resources over time]
  • Maintaining capacity just above the peak as workloads increase is the art of capacity planning. One problem is the large step when upgrading to more resources, equating to a large capital cost. [Chart: capacity and demand, resources over time]
  • Great performance after an upgrade, bad performance at year-end before the next upgrade. A steady decline can be worse for user perception than constant mediocre performance. [Chart: capacity, demand, and idle resources over time]
  • What everyone would like: elastic capacity. Pay for the resources you use when you use them, not up front for the entire system that supplies them. Just like electricity. [Chart: capacity and demand, resources over time]
  • Five Key Cloud Characteristics: 1. On-demand self-service 2. Network accessibility 3. Resource pooling 4. Measured service 5. Elasticity
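The capacity slides above can be illustrated with a small calculation. This is a minimal sketch, not something taken from the presentation: the demand figures, the step size of 100 units, and the function names are all hypothetical. It compares capacity bought in large steps with an idealized elastic capacity that tracks demand exactly.

```python
from math import ceil


def stepped_capacity(demand, step):
    """Capacity bought in fixed-size steps, upgraded whenever demand exceeds it."""
    capacity, current = [], 0
    for d in demand:
        if d > current:
            # Upgrade to the smallest multiple of `step` that covers demand.
            current = ceil(d / step) * step
        capacity.append(current)
    return capacity


def utilization(demand, capacity):
    """Fraction of provisioned capacity actually used over the whole period."""
    return sum(demand) / sum(capacity)


# Hypothetical demand per period, in arbitrary resource units.
demand = [40, 55, 70, 90, 120, 110, 150, 180]

fixed = stepped_capacity(demand, step=100)
elastic = list(demand)  # idealized assumption: capacity tracks demand exactly

idle_units = sum(c - d for c, d in zip(fixed, demand))
fixed_utilization = utilization(demand, fixed)

# Elastic capacity is cheaper only while its per-unit price premium
# stays below this ratio of provisioned to used resource units.
break_even_premium = sum(fixed) / sum(elastic)

print(fixed)
print(idle_units, round(fixed_utilization, 3), round(break_even_premium, 3))
```

Under these assumed numbers, roughly a third of the stepped capacity sits idle, so pay-per-use would cost less as long as its per-unit price were less than about 1.47 times the owned price. Real workloads and price lists will shift that ratio.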
  • Cloud Architecture. Started with virtual machines. Lots of servers, lots of virtual nodes. But in public clouds:
    ▪ Storage can be, and often is, separated
    ▪ VMs don't run across nodes
    ▪ Great for OLTP, not so much for BI
    ▪ Implies new software architectures
    [Diagram: virtual machines, each with memory, CPU, and disk, spread across many servers, some with shared disk]
  • Database Architecture and the Cloud. Virtualizing on a single server makes no sense for a database that needs the full resources. If your server hardware environment looks like this: [Diagram: many virtual machines, each with memory, CPU, and disk, on servers with shared CPUs, memory, and shared disk] then it's probably good for lightweight transaction processing, simple storage and retrieval, procedural computations on data. If you want to use it for a data warehouse, you need:
    ▪ A shared-nothing database
    ▪ A proper storage architecture
    ▪ Dynamic licensing
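To make "shared-nothing" concrete, the sketch below shows the general idea under simplifying assumptions: rows are hash-partitioned across nodes by a distribution key, each node aggregates only its own rows, and a coordinator merges the partial results. The row layout, the CRC32-based placement, and all function names are illustrative choices, not a description of any particular database product.

```python
import zlib
from collections import Counter


def node_for(key, n_nodes):
    """Deterministically map a distribution key to a node (illustrative scheme)."""
    return zlib.crc32(str(key).encode("utf-8")) % n_nodes


def partition(rows, n_nodes, key):
    """Place every row on exactly one node; nodes share no storage."""
    nodes = [[] for _ in range(n_nodes)]
    for row in rows:
        nodes[node_for(row[key], n_nodes)].append(row)
    return nodes


def local_sum(rows, group_by, measure):
    """Aggregation a single node performs over its own rows only."""
    totals = Counter()
    for row in rows:
        totals[row[group_by]] += row[measure]
    return totals


def distributed_sum(nodes, group_by, measure):
    """Coordinator step: merge the per-node partial sums."""
    merged = Counter()
    for node_rows in nodes:
        merged.update(local_sum(node_rows, group_by, measure))
    return dict(merged)


# Hypothetical fact rows for illustration.
regions = ["east", "west", "north"]
rows = [
    {"customer_id": i, "region": regions[i % 3], "amount": 10 * i}
    for i in range(1, 13)
]

nodes = partition(rows, n_nodes=4, key="customer_id")
result = distributed_sum(nodes, group_by="region", measure="amount")
print(result)
```

Because each node works only on its own partition, adding nodes adds both storage and compute, which is why this layout fits scan-heavy warehouse queries better than many virtual machines contending for one shared disk.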
  • Three Models of Deployment: 1. Public cloud 2. Leased / hosted private cloud 3. Private cloud
  • Benefits and Rationale. Why did you / are you considering a move to the cloud? Two primary reasons:
    ▪ Cost reduction
    ▪ Reduced time to value
    [Chart, grouped under cost reduction and reduced time to value, with responses ranging from 39% to 50%: hardware savings; pay only for what we use; software license savings; lower labor costs; lower outside maintenance costs; reduce IT support needs; able to take advantage of latest functionality; able to scale IT resources to meet needs; relieve pressure on internal resources; rapid deployment; resolve problems related to updating/upgrading]
    Source: IBM global survey of IT and line-of-business decision makers
  • Unexpected Benefits
    Speed to deploy:
    ▪ Opex vs. capex means faster approvals and less planning
    ▪ Provision on-demand means the ability to do all those small projects that needed resources and staff to set up
    Performance management:
    ▪ Resource-oriented fixes done in minutes
    ▪ Instead of static resources and fluctuations in performance, set static SLAs and fluctuate the resources
    Administration:
    ▪ No more hardware or operating system upgrades to deal with
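The "set static SLAs and fluctuate the resources" point can be sketched as a sizing rule. This is an illustrative simplification rather than the presenter's method: it assumes a hypothetical per-node throughput at which the SLA is still met and assumes throughput scales linearly with node count, which real BI workloads often do not.

```python
from math import ceil


def nodes_for_load(load, per_node_throughput, min_nodes=1):
    """Nodes needed so each stays at or below the throughput that meets the SLA."""
    return max(min_nodes, ceil(load / per_node_throughput))


# Hypothetical queries per hour across part of a working day.
hourly_load = [20, 35, 180, 420, 390, 150, 60]

# Assumption: one node sustains 100 queries/hour within the SLA.
plan = [nodes_for_load(q, per_node_throughput=100) for q in hourly_load]

elastic_node_hours = sum(plan)
# A static system must be sized for the peak hour for the entire period.
static_node_hours = max(plan) * len(plan)

print(plan)
print(elastic_node_hours, static_node_hours)
```

With these assumed figures the elastic plan uses fewer than half the node-hours of a system sized statically for the peak, while holding the same per-node load ceiling in every hour.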
  • Public Cloud Challenges
    1. Multi-tenant servers and unpredictable I/O performance
    2. Legal problems:
    ▪ Data co-mingling in multi-tenant databases
    ▪ Data locality and national laws
    3. Cloud compatibility for data integration and data management tools (environment, data movement)
    4. Security requirements
    When these are a concern, private clouds may be the better option today.
  • What are manager preferences? [Chart: data warehouses or data marts: 21%, 44%, 35%; data mining, text mining, or other analytics: 9%, 52%, 39%. Legend: prefer not to use cloud, private cloud preference, public cloud preference.] Source: IBM global survey of IT and line-of-business decision makers
  • Comparison of Models
  • New and growing use cases drive the need to expand. The use cases are now interactive applications, lower latency data, complex analytics and rapidly growing data volumes.
  • Image Attributions. Thanks to the people who supplied the images used in this presentation:
    ▪ Commoditization diagram: from A Lifecycle Approach to Cloud Computing, © Simon Wardley
    ▪ tesla coil train: http://www.flickr.com/photos/winterhalter/27364687
    ▪ Amazon Virtual Private Cloud diagram: © Amazon, Inc.
    ▪ caged_tower_melbourne.jpg: http://www.flickr.com/photos/vermininc/2227512763
  • About the Presenter: Mark Madsen is president of Third Nature, a technology research and consulting firm focused on business intelligence, analytics and information management. Mark is an award-winning author, architect and former CTO whose work has been featured in numerous industry publications. During his career Mark received awards from the American Productivity & Quality Center, TDWI, Computerworld and the Smithsonian Institute. He is an international speaker, contributing editor at Intelligent Enterprise, and manages the open source channel at the Business Intelligence Network. For more information or to contact Mark, visit http://ThirdNature.net.
  • About Third Nature: Third Nature is a research and consulting firm focused on new and emerging technology and practices in business intelligence, analytics and performance management. If your question is related to BI, analytics, information strategy and data then you're at the right place. Our goal is to help companies take advantage of information-driven management practices and applications. We offer education, consulting and research services to support business and IT organizations as well as technology vendors. We fill the gap between what the industry analyst firms cover and what IT needs. We specialize in product and technology analysis, so we look at emerging technologies and markets, evaluating technology and how it is applied rather than vendor market positions.