Database as a Service - Tutorial @ICDE 2010

Database as a Service
Seminar, ICDE 2010, Long Beach, March 04
Wolfgang Lehner | Dresden University of Technology, Germany
Kai-Uwe Sattler | Ilmenau University of Technology, Germany

Introduction
  Motivation
  SaaS
  Cloud Computing
  Use Cases

Software as a Service (SaaS)
  Traditional Software: Build Your Own
  On-Demand Utility: Plug In, Subscribe, Pay-per-Use

Avoid hidden cost of traditional SW
  Traditional Software: SW Licenses, Training, Customization, Hardware, IT Staff, Maintenance
  SaaS: Subscription Fee, Training, Customization

The Long Tail
  Dozens of markets of millions or millions of markets of dozens?
  What if you lower your cost of sale (i.e., lower the barrier to entry) and also lower your cost of operations?
  New addressable market >> current market
  [Figure: $ / customer plotted against # of customers, from your large customers through your typical customers to (currently) "non-addressable" customers]

What is Cloud? – Gartner's Definition
  Cloud Computing: A style of computing where massively scalable, IT-enabled capabilities are provided "as a service" across the Internet to multiple external customers.
  Acquisition Model: Service. "All that matters is results - I don't care how it is done"
  Business Model: Pay for usage. "I don't want to own assets - I want to pay for elastic usage, like a utility"
  Access Model: Internet. "I want accessibility from anywhere from any device"
  Technical Model: Scalable, elastic, shareable. "It's about economies of scale, with effective and dynamic sharing"
  Example: EC2 & S3

To Qualify as a Cloud
  Common, Location-independent, Online Utility on Demand*
    Common implies multi-tenancy, not single or isolated tenancy
    Utility implies pay-for-use pricing
    On Demand implies ~infinite, ~immediate, ~invisible scalability
  Alternatively, a "Zero-One-Infinity" definition:**
    0: On-premise infrastructure, acquisition cost, adoption cost, support cost
    1: Coherent and resilient environment, not a brittle "software stack"
    Infinity: Scalability in response to changing need; integratability/interoperability with legacy assets and other services; customizability/programmability from data, through logic, up into the user interface without compromising robust multi-tenancy
  * Joe Weinman, Vice President of Solutions Sales, AT&T, 3 Nov. 2008
  ** From The Jargon File: "Allow none of foo, one of foo, or any number of foo"

Cloud Differentials: Service Models
  Cloud Software as a Service (SaaS): use provider's applications over a network
  Cloud Platform as a Service (PaaS): deploy customer-created applications to a cloud
  Cloud Infrastructure as a Service (IaaS): rent processing, storage, network capacity, and other fundamental computing resources

Cloud Differentials: Characteristics
  Platform: Physical – Virtual; Homogeneous – Heterogeneous
  Design Paradigms: Storage, CPU, Bandwidth
  Usage Model: Exclusive, Shared, Pseudo-Shared
  Size/Location: Large Scale (AWS, Google, IBM/Google), Small Scale (SMB, Academia)
  Purpose: General Purpose, Special Purpose (e.g., DB-Cloud)
  Administration/Jurisdiction: Public, Private

Use Cases: Large-Scale Data Analytics
  Outsource your data and use cloud resources for analysis
    Historical and mostly non-critical data
    Parallelizable, read-mostly, highly variable workloads
    Relaxed ACID guarantees
  Examples (Hadoop PoweredBy):
    Yahoo!: research for ad systems and Web search
    Facebook: reporting and analytics
    Netseer.com: crawling and log analysis
    Journey Dynamics: traffic speed forecasting

Use Cases: Database Hosting
  Public datasets
    Biological databases: a single repository instead of > 700 separate databases
    Semantic Web data, Linked Data, ...
    Sloan Digital Sky Survey
    TwitterCache
  Already on Amazon AWS: annotated human genome data, US census, Freebase, ...
  Archiving, Metadata Indexing, ...

Use Cases: Service Hosting
  Data management for SaaS solutions
  Run the services near the data = ASP
  Already many existing applications
    CRM, e.g. Salesforce, SugarCRM
    Web Analytics
    Supply Chain Management
    Help Desk Management
    Enterprise Resource Planning, e.g. SAP Business ByDesign
    ...

Foundations & Architectures
  Virtualization
  Programming models
  Consistency models & replication
  SLAs & Workload management
  Security

Topics covered in this Seminar
  Query & Programming Model
  Logical Data Model
  Virtualization
  Multi-Tenancy
  Service Level Agreements
  Storage Model
  Distributed Storage
  Replication
  Security

Current Solutions
  user perspective: one DB for all clients, one DB per client
  physical perspective: Virtualization, Replication, Distributed Storage

Virtualization
  Separating the abstract view of computing resources from the implementation of these resources
    adds flexibility and agility to the computing infrastructure
    softens problems related to provisioning, manageability, …
    lowers TCO: fewer computing resources
  Classical driving factor: server consolidation
  [Figure: e-mail, Web, and database servers, each on its own Linux machine, consolidated through virtualization; improved utilization using consolidation. Source: EDBT 2008 Tutorial (Aboulnaga et al.)]

What can be virtualized – the big four.

Different Types of Virtualization
  [Figure: applications (APP 1 to APP 5) run on operating systems inside virtual machines 1 and 2, each with virtual CPU, MEM, and NET resources; a virtual machine monitor (VMM) maps them onto the physical machine's CPU, MEM, NET, and physical storage]

Virtual Machines
  Technique with long history (since the 1960s)
    Prominent since the IBM 370 mainframe series
    Today: large-scale commodity hardware and operating systems
  Virtual Machine Monitor (Hypervisor)
    strong isolation between virtual machines (security, privacy, fault tolerance)
    flexible mapping between virtual machines and physical resources
    classical operations: pause, resume, checkpoint, migrate (admin / load balancing)
  Software deployment
    Preconfigured virtual appliances
    Repositories of virtual appliances on the web

DBMS on top of Virtual Machines
  ... yet another application?
  ... Overhead?
  [Figure: SQL Server within VMware]

Virtualization Design Advisor
  What fraction of node resources goes to what DBMS?
    Configuring VM parameters
  What parameter settings are best for a given resource configuration?
    Configuring the DBMS parameters
  Example
    Workload 1: TPC-H (10 GByte)
    Workload 2: TPC-H (10 GByte), only Q18 (132 copies)
  Virtualization design advisor
    20% of CPU to Workload 1
    80% of CPU to Workload 2

Some Experiments
  Workload definition based on TPC-H
    Q18 is one of the most CPU-intensive queries
    Q21 is one of the least CPU-intensive queries
  Workload units
    C: 25x Q18
    I: 1x Q21
  Experiment: sensitivity to workload resource needs
    W1 = 5C + 5I
    W2 = kC + (10-k)I (increase of k -> more CPU intensive)
  [Figure: results for Postgres and DB2]

Some Experiments (2)
  Workload settings: W3 = 1C, W4 = kC
  Workload settings: W5 = 1C, W6 = kI

Virtualization in DBaaS Environments
  [Figure: layered stack. DB Layer (databases), Instance Layer (instances), DB Server Layer (DB servers), VM Layer (VMs), HW Layer]

Existing Tools for Node Virtualization
  [Figure: layer stack (DB Layer, Instance Layer, DB Server Layer, VM Layer) annotated with existing tools: DB Advisor (indexes, redistribution of tables), DB Workload Manager, VM configuration, monitoring, node resource model]
  Static environment assumptions
    Advisor expects static hardware environment
    VM expects static (peak) resource requirements
  Interactions between layers can improve performance/utilization

Layer Interactions (2)
  Experiment
    DB2 on Linux
    TPC-H workload on 1 GB database
    Ranges for resource grants
      Main memory (BP): 50 MB to 1 GB
      Additional storage (indexes): 5% to 30% of DB size
  Varying advisor output (17-26 indexes)
    Different possible improvement
    Different expected performance after improvement
  [Figure: two DB Advisor plots, "Expected Performance" and "Possible Improvement", each over index storage and buffer pool (BP) size (200 MB to 1 GB) as granted by the VM configuration]

Storage Virtualization
  General goal
    provide a layer of indirection to allow the definition of virtual storage devices
    minimize/avoid downtime (local and remote mirroring)
    improve performance (distribution/balancing, provisioning, control placement)
    reduce cost of storage administration
  Operations
    create, destroy, grow, shrink virtual devices
    change size, performance, reliability, ...
    workload fluctuations
    hierarchical storage management
    versioning, snapshots, point-in-time copies
    backup, checkpoints
    exploit CPU and memory in the storage system
      caching
      execute low-level DBMS functions

Virtualization in DBaaS Environments (2)
  [Figure: the layered stack (DB Layer, Instance Layer, DB Server Layer, VM Layer, HW Layer) extended by a Storage Layer with shared disk and local disk]

Virtualization in DBaaS Environments (2)
  [Figure: layer stack (DB Layer, Instance Layer, DB Server Layer, VM Layer, HW Layer, Storage Layer) annotated with tools: DB Advisor (indexes, redistribution of tables), DB Workload Manager, storage resource model, storage configuration (device bundling, archiving), shared disk, local disk]

One way to go? Paravirtualization
  CPU and memory paravirtualization
    extends the guest to allow direct interaction with the underlying hypervisor
    reduces the monitor cost, including memory and system call operations
    gains from paravirtualization are workload specific
  Device paravirtualization
    places a high-performance, virtualization-aware device driver into the guest
    paravirtualized drivers are more CPU efficient (less CPU overhead for virtualization)
    paravirtualized drivers can also take advantage of HW features, like partial offload


Multi Tenancy
  Goal: consolidate multiple customers onto the same operational system
  [Figure: spectrum from separate DB per tenant (flexible, but limited scalability) through shared DB with separate schema to shared DB with shared schema (best resource utilization)]
  Requirements:
    Extensibility: customer-specific schema changes
    Security: preventing unauthorized data accesses by other tenants
    Performance/scalability: scale-up & scale-out
    Maintenance: on tenant level instead of on database level

Flexible Schema Approaches
  Goal: allow tenant-specific schema additions (columns)
  [Figure: Universal Table, Extension Table, Pivot Table]

Flexible Schema Approaches: Comparison
  [Figure: approaches arranged between "application owns the schema" and "database owns the schema", and between "best performance" and "flexible schema evolution": universal table, pivot table, chunk folding, extension table, XML columns, private tables]
  Universal table:
    Requires techniques for handling sparse data
    Fine-grained index support not possible
  Pivot table:
    Requires joins for reconstructing logical tuples
  Chunk folding: similar to pivot tables
    Groups of columns are combined in a chunk and mapped into a chunk table
    Requires complex query transformation
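
To make the pivot-table cost concrete, here is a minimal Python sketch of reconstructing logical tuples. The row layout (tenant_id, row_id, column_name, value) and the sample data are assumptions for illustration, not taken from the slides; the grouping by row_id stands in for the joins a relational engine would need.

```python
from collections import defaultdict

# Hypothetical pivot-table rows: (tenant_id, row_id, column_name, value).
PIVOT_ROWS = [
    (17, 0, "name", "Acme"),
    (17, 0, "beds", 135),
    (17, 1, "name", "Gump"),
    (42, 0, "name", "Ball"),
    (42, 0, "dealers", 65),
]


def logical_rows(pivot_rows, tenant_id):
    """Rebuild one tenant's logical tuples from the pivot table.

    Grouping by row_id plays the role of the joins a relational
    engine would need, one per reconstructed column.
    """
    rows = defaultdict(dict)
    for tenant, row_id, column, value in pivot_rows:
        if tenant == tenant_id:
            rows[row_id][column] = value
    return [rows[row_id] for row_id in sorted(rows)]
```

Each tenant can carry its own columns ("beds" for tenant 17, "dealers" for tenant 42) without a schema change, at the price of reassembling every tuple on read.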

Access Control in Multi-Tenant DB
  Shared DB approaches require row-level access control
  Query transformation
    ... where TenantID = 42 ...
    Potential security risks
  DBMS-level control, e.g. IBM DB2 LBAC
    Label-based access control
    Controls read/write access to individual rows and columns
    Security labels with policies
    Requires separate account for each tenant
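
The query-transformation approach can be sketched in a few lines of Python with the standard sqlite3 module. The table `orders`, its columns, and the helper names are hypothetical; the only element taken from the slide is the appended `TenantID = ...` predicate.

```python
import sqlite3


def demo_connection():
    """Build an in-memory shared table holding rows of two tenants (sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (TenantID INTEGER, name TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(42, "disk", 10), (42, "cpu", 30), (7, "disk", 99)],
    )
    return conn


def tenant_query(conn, tenant_id, where="1 = 1", params=()):
    """Run a query on the shared table with the tenant predicate appended.

    The tenant id is bound as a parameter and never taken from the caller's
    SQL text. The `where` fragment is still trusted application code, which
    is the kind of security risk the transformation approach carries.
    """
    sql = (
        "SELECT name, amount FROM orders "
        f"WHERE ({where}) AND TenantID = ? ORDER BY name"
    )
    return conn.execute(sql, (*params, tenant_id)).fetchall()
```

A call such as `tenant_query(conn, 42, "amount > ?", (20,))` only ever sees rows of tenant 42, provided every access path goes through the transformation layer.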

In a Nutshell
  How shall virtualization be handled on
    Machine level (VM to HW)
    DBMS level (database to instance to database server)
    Schema level (multi-tenancy)
  ... using …
    Allocation between layers
    Configuration inside layers
    Flexible schemas
  … when …
    Characteristics of the workloads are known
    Virtual machines are transparent
    Tenant-specific schema extensions
  … demanding that …
    SLAs and security are respected
    Each node's utilization is maximized
    Number of nodes is minimized

MapReduce Background
  Programming model and an associated implementation for large-scale data processing
  Google and related approaches: Apache Hadoop and Microsoft Dryad
  User-defined map & reduce functions
  Infrastructure
    hides details of parallelization
    provides fault-tolerance, data distribution, I/O scheduling, load balancing, ...
  map (in_key, in_value) -> (out_key, intermediate_value) list
  reduce (out_key, intermediate_value list) -> out_value list
  [Figure: { (key, value) } pairs flowing through parallel map (M) and reduce (R) tasks]

Logic Flow of WordCount
  Input: "Hadoop Map/Reduce is a software framework for easily writing applications which process vast amounts of data (multi-terabyte data-sets) in-parallel on large clusters (thousands of nodes) of commodity hardware in a reliable, fault-tolerant manner…"
  Mapper
    input (key, line): (1, "Hadoop Map/Reduce is a"), (17, "software framework for"), (45, "easily writing applications"), …
    output: (Hadoop, 1), (Map, 1), (Reduce, 1), (is, 1), (a, 1), …
  Sort/Shuffle
  Reducer
    input: Hadoop [1, 1, 1, …, 1]; Map [1, 1, 1, …, 1]; Reduce [1, 1, 1, …, 1]; is [1, 1, 1, …, 1]; a [1, 1, 1, …, 1]
    output: (Hadoop, 5), (Map, 12), (Reduce, 12), (is, 42), (a, 23)
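
The WordCount flow can be imitated in plain, single-process Python to show what each phase contributes. This is a sketch of the programming model only, not of Hadoop's API; the function names and the use of the line index as map key are assumptions.

```python
from collections import defaultdict


def map_fn(key, line):
    """map(in_key, in_value) -> list of (out_key, intermediate_value)."""
    return [(word, 1) for word in line.split()]


def shuffle(pairs):
    """Group intermediate values by key, as the sort/shuffle phase does."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))


def reduce_fn(key, values):
    """reduce(out_key, intermediate_value list) -> (out_key, out_value)."""
    return (key, sum(values))


def word_count(lines):
    """Chain map, shuffle, and reduce over a list of input lines."""
    intermediate = []
    for index, line in enumerate(lines):
        intermediate.extend(map_fn(index, line))
    return dict(reduce_fn(key, values) for key, values in shuffle(intermediate).items())
```

In a real deployment the map and reduce calls run in parallel on different nodes and the shuffle moves data over the network; the user still only writes the two functions.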

MapReduce Disadvantages
  Extremely rigid data flow
  Common operations must be coded by hand
    join, filter, split, projection, aggregates, sorting, distinct
  User plans may be suboptimal and lead to performance degradation
  Semantics hidden inside map-reduce functions
    Inflexible, difficult to maintain, extend and optimize
  Combination of high-level declarative querying and low-level programming with MapReduce
    Dataflow programming languages: Hive, JAQL and Pig
  [Figure: chained map (M) and reduce (R) stages]

Pig Latin
  On top of map-reduce / Hadoop
  Mix of declarative style of SQL and procedural style of map-reduce
  Consists of two parts
    Pig Latin: a data processing language
    Pig infrastructure: an evaluator for Pig Latin programs
      Pig compiles Pig Latin into physical plans
      Plans are to be executed over Hadoop
  30% of all queries at Yahoo! are in Pig Latin
  Open source, http://incubator.apache.org/pig

Example
  Task: Determine the most visited websites in each category.
  [Figure: input tables URL Info and Visits]

Example Workflow in Pig Latin
    visits = load '/data/visits' as (user, url, time);
    gVisits = group visits by url;
    visitCounts = foreach gVisits generate url, count(visits);
    urlInfo = load '/data/urlInfo' as (url, category, pRank);
    visitCounts = join visitCounts by url, urlInfo by url;
    gCategories = group visitCounts by category;
    topUrls = foreach gCategories generate top(visitCounts,10);
    store topUrls into '/data/topURLs';
  Dataflow: load Visits -> group by url -> foreach url generate count -> join on url (with load URL Info) -> group by category -> foreach category generate top10 URLs
  Operates directly over files.
  Schemas are optional and can be assigned dynamically.
  User-defined functions (UDFs) can be used in every construct: load, store, group, filter, foreach
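
For readers who do not know Pig Latin, the same dataflow can be written as a plain Python sketch over in-memory tuples. The function name, the tie-breaking rule (by URL), and the inner-join behaviour are assumptions made for illustration; the steps mirror the workflow line by line.

```python
from collections import Counter, defaultdict


def top_urls(visits, url_info, n=10):
    """visits: (user, url, time) tuples; url_info: (url, category, pRank) tuples."""
    # group visits by url; foreach group generate url, count(visits)
    visit_counts = Counter(url for _user, url, _time in visits)
    # join visitCounts by url, urlInfo by url (inner join)
    joined = [
        (url, category, visit_counts[url])
        for url, category, _rank in url_info
        if url in visit_counts
    ]
    # group visitCounts by category
    by_category = defaultdict(list)
    for url, category, count in joined:
        by_category[category].append((url, count))
    # foreach category generate the n most visited URLs (ties broken by URL)
    return {
        category: sorted(urls, key=lambda uc: (-uc[1], uc[0]))[:n]
        for category, urls in by_category.items()
    }
```

Every grouping or join step in this sketch is a point where a distributed engine has to repartition the data by key.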

Compilation in MapReduce
  Every group or join operation forms a map-reduce boundary
  Other operations are pipelined into map and reduce phases
  [Figure: the example plan split into three jobs. Map1/Reduce1: load Visits, group by url, foreach url generate count. Map2/Reduce2: load URL Info, join on url. Map3/Reduce3: group by category, foreach category generate top10 URLs]

Hive
  Data warehouse infrastructure built on top of Hadoop, providing:
    Data summarization
    Ad hoc querying
  Simple query language: Hive QL (based on SQL)
  Extendable via custom mappers and reducers
  Subproject of Hadoop
  No "Hive format"
  http://hadoop.apache.org/hive/

Hive - Example
    LOAD DATA INPATH '/data/visits' INTO TABLE visits;
    INSERT OVERWRITE TABLE visitCounts
      SELECT url, category, count(*)
      FROM visits
      GROUP BY url, category;
    LOAD DATA INPATH '/data/urlInfo' INTO TABLE urlInfo;
    INSERT OVERWRITE TABLE visitCounts
      SELECT vc.*, ui.*
      FROM visitCounts vc JOIN urlInfo ui ON (vc.url = ui.url);
    INSERT OVERWRITE TABLE gCategories
      SELECT category, count(*)
      FROM visitCounts
      GROUP BY category;
    INSERT OVERWRITE TABLE topUrls
      SELECT TRANSFORM (visitCounts) USING 'top10';

JAQL
  Higher-level query language for JSON documents
  Developed at IBM's Almaden research center
  Supports several operations known from SQL
    Grouping, joining, sorting
  Built-in support for
    Loops, conditionals, recursion
  Custom Java methods extend JAQL
  JAQL scripts are compiled to MapReduce jobs
  Various I/O
    Local FS, HDFS, HBase, custom I/O adapters
  http://www.jaql.org/

JAQL - Example
    registerFunction("top", "de.tuberlin.cs.dima.jaqlextensions.top10");
    $visits = hdfsRead("/data/visits");
    $visitCounts = $visits -> group by $url = $ into { $url, num: count($) };
    $urlInfo = hdfsRead("data/urlInfo");
    $visitCounts = join $visitCounts, $urlInfo where $visitCounts.url == $urlInfo.url;
    $gCategories = $visitCounts -> group by $category = $ into { $category, num: count($) };
    $topUrls = top10($gCategories);
    hdfsWrite("/data/topUrls", $topUrls);

ACID vs. BASE
  Traditional distributed data management: ACID
    Strong consistency
    Isolation
    Focus on "commit"
    Availability?
    Pessimistic
    Difficult evolution (e.g. schema)
  Web-scale data management: Basically Available, Soft-state, Eventually consistent
    Weak consistency
    Availability first
    Best effort
    Optimistic (aggressive)
    Fast and simple
    Easier evolution

CAP Theorem [Brewer 2000]
  Consistency: all clients have the same view, even in case of updates
  Availability: all clients find a replica of data, even in the presence of failures
  Tolerance to network partitions: system properties hold even when the network (system) is partitioned
  You can have at most two of these properties for any shared-data system.

CAP Theorem
  No consistency guarantees ➟ updates with conflict resolution
  On a partition event, simply wait until data is consistent again ➟ pessimistic locking
  All nodes are in contact with each other or put everything in a single box ➟ 2-phase commit

CAP: Explanations
  PA := update(o), PB := read(o)
  [Figure: steps 1 to 3, with message M sent between the two processes]
  Network partitions ➫ M is not delivered
  Solutions?
    Synchronous message: <PA, M> is atomic
      Possible latency problems (availability)
    Transaction <PA, M, PB>: requires control over when PB happens
      Impacts partition tolerance or availability

Consistency Models [Vogels 2008]
  [Figure: processes A, B, C access a distributed storage system holding D0; A performs update: D0 -> D1, the others perform read(D)]
  Strong consistency: after the update completes, any subsequent access from A, B, C will return D1
  Weak consistency: does not guarantee that subsequent accesses will return D1 -> a number of conditions need to be met before D1 is returned
  Eventual consistency: special form of weak consistency
    Guarantees that if no new updates are made, eventually all accesses will return D1

Variations of Eventual Consistency
  Causal consistency: if A notifies B about the update, B will read D1 (but not C!)
  Read-your-writes: A will always read D1 after its own update
  Session consistency: read-your-writes inside a session
  Monotonic reads: if a process has seen Dk, any subsequent access will never return any Di with i < k
  Monotonic writes: guarantees to serialize the writes of the same process
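
One way to picture the monotonic-reads guarantee is a client session that remembers the highest version it has seen and refuses older replicas. The sketch below is an illustration under assumed names (`Replica`, `MonotonicReader`) and an assumed integer version per object; real systems enforce this in different ways.

```python
class Replica:
    """A replica holding one object as a (version, value) pair."""

    def __init__(self, version, value):
        self.version = version
        self.value = value


class MonotonicReader:
    """Client-side session that never returns a version older than one it has seen."""

    def __init__(self):
        self.last_seen = -1

    def read(self, replicas):
        # Try replicas in the given order; skip any that lag behind this session.
        for replica in replicas:
            if replica.version >= self.last_seen:
                self.last_seen = replica.version
                return replica.value
        raise RuntimeError("no replica is fresh enough for this session")
```

Once the session has read D1 (version k), a replica still at D0 is skipped; if only stale replicas are reachable, the read fails instead of going back in time, which is the availability price of the guarantee.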

Database Replication
  Store the same data on multiple nodes in order to improve reliability, accessibility, fault-tolerance
  [Figure: design space. Single master vs. multimaster; 1-copy consistency vs. relaxed consistency (optimistic replication)]
  Optimistic strategies = lazy replication
    Allow replicas to diverge; require conflict resolution
    Allow data to be accessed without a-priori synchronization
    Updates are propagated in the background
    Occasional conflicts are fixed after they happen
    Improved availability, flexibility, scalability, but see CAP theorem

Optimistic Replication: Elements
  1. operation submission
  2. propagation
  3. scheduling
  4. conflict resolution
  5. commitment
  [Figure: operations 1 and 2 passing through these steps until every replica holds 1+2]
  Y. Saito, M. Shapiro: Optimistic Replication, ACM Computing Surveys, 37(1):42-81, 2005
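
Conflict resolution presupposes that concurrent updates can be detected. One common syntactic technique is the vector clock; the following Python sketch shows the comparison rule. The dict representation {site: counter} and the function names are assumptions chosen for illustration.

```python
def tick(clock, site):
    """Return a copy of clock with site's counter advanced by one local update."""
    updated = dict(clock)
    updated[site] = updated.get(site, 0) + 1
    return updated


def compare(a, b):
    """Compare two vector clocks given as dicts {site: counter}.

    Returns "equal", "before" (a happened before b), "after", or
    "concurrent" (neither dominates, i.e. a conflict to resolve).
    """
    sites = set(a) | set(b)
    a_le_b = all(a.get(s, 0) <= b.get(s, 0) for s in sites)
    b_le_a = all(b.get(s, 0) <= a.get(s, 0) for s in sites)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


def merge(a, b):
    """Element-wise maximum, used once a conflict has been resolved."""
    return {s: max(a.get(s, 0), b.get(s, 0)) for s in set(a) | set(b)}
```

Two replicas that both update the same base version end up with clocks that compare as "concurrent"; how the values are then reconciled is a separate, often application-specific, decision.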

Conflict Resolution & Update Propagation
  Prohibit: single master
  Ignore: Thomas write rule
  Reduce: dividing objects, ...
  Detect & repair
    Syntactic: vector clocks
    Semantic: app-specific ordering or preconditions

Epidemic information dissemination
  Updates pass through the system like infectious diseases
  Pairwise communication: a site contacts others (randomly chosen) and sends its information, e.g. about updates
  All sites process messages in the same way
  Proactive behaviour: no failure recovery necessary!
  Basic approaches: anti-entropy, rumor mongering, ...
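
Anti-entropy, one of the basic epidemic approaches, can be simulated in a few lines of Python. The sketch assumes a push-pull exchange, a (version, value) pair per key, and "highest version wins" as reconciliation rule; none of these details come from the slides.

```python
import random


def anti_entropy(states, rng, max_rounds=1000):
    """Simulate push-pull anti-entropy over per-site stores.

    states: list of dicts {key: (version, value)}, one per site (at least two).
    Each round, every site contacts one randomly chosen peer and both keep
    the higher-versioned entry for every key. Returns the number of rounds
    until all sites agree, or max_rounds if they did not converge.
    """
    n = len(states)
    for round_no in range(1, max_rounds + 1):
        for site in range(n):
            peer = rng.choice([p for p in range(n) if p != site])
            for key in set(states[site]) | set(states[peer]):
                newest = max(
                    states[site].get(key, (-1, None)),
                    states[peer].get(key, (-1, None)),
                    key=lambda entry: entry[0],
                )
                states[site][key] = states[peer][key] = newest
        if all(state == states[0] for state in states):
            return round_no
    return max_rounds
```

All sites run the same loop and no site is special, so a crashed or late site simply catches up the next time it exchanges state with a peer.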

The Notion of QoS and Predictability
  Service Level Agreement: common understanding about services, guarantees, responsibilities
    Legal part: fees, penalties, ...
    Technical part: Service Level Objectives
      Specific measurable characteristics; e.g. importance, performance goals
  [Figure: stack of Application Server / middleware, DBMS, OS / Hardware]

Techniques for QoS in Data Management
  Provide sufficient resources
    Capacity planning: "How many boxes for customer X?"
    Cost vs. performance tradeoff
  Shielding
    Dedicated (virtual) system for customers
    Scalability? Cost efficiency?
  Scheduling
    Ordering requests by priority
    At which level?

Workload Management
  Purpose:
    achieve performance goals for classes of requests (queries, transactions)
    Resource provisioning
  Aspects:
    Specification of service-level objectives
    Workload classification and modeling
    Admission control & scheduling
      Static prioritization: DB2 Query Patroller, Oracle Resource Manager, ...
      Goal-oriented approaches
      Economic approaches
      Utility-based approaches

Workload Characteristics
  Functional
    I/O requirements (volume, bandwidth)
    CPU
    Degree of parallelism
    Response times?
    Throughput?
    …
  Non-Functional
    Availability
    Reliability
    Durability
    Scalability
    …

WLM: Model
  [Figure: transaction -> workload classification (classes) -> admission control & scheduling (MPL) -> result; measured as response time]
  Admission control: limit the number of simultaneously executing requests (multiprogramming level = MPL)
  Scheduling: ordering requests by priority
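
The interplay of admission control and priority scheduling can be sketched as a small simulation. It assumes that all requests arrive at time 0, that a lower priority number means more important, and that running requests do not slow each other down; the function name and request tuples are hypothetical.

```python
import heapq


def run_with_mpl(requests, mpl):
    """Simulate admission control with a fixed multiprogramming level.

    requests: list of (name, priority, duration) tuples.
    At most `mpl` requests execute at once; whenever a slot frees up, the
    waiting request with the best priority is admitted.
    Returns {name: completion_time}.
    """
    if mpl < 1:
        raise ValueError("mpl must be at least 1")
    waiting = [
        (priority, index, name, duration)
        for index, (name, priority, duration) in enumerate(requests)
    ]
    heapq.heapify(waiting)
    running = []  # min-heap of (finish_time, name)
    finished = {}
    now = 0
    while waiting or running:
        # Admit by priority while execution slots are free.
        while waiting and len(running) < mpl:
            _priority, _index, name, duration = heapq.heappop(waiting)
            heapq.heappush(running, (now + duration, name))
        # Advance time to the next completion.
        now, name = heapq.heappop(running)
        finished[name] = now
    return finished
```

Lowering the MPL protects admitted requests from interference but lets low-priority work queue up longer, which is the tradeoff a workload manager tunes.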

Utility Functions
  Utility function = preference specification
    maps possible system states (e.g. resource provisioning to jobs) to a real scalar value
    Represents performance feature (response time, throughput, ...) and/or economic value
  Goal: determine the most valuable feasible state, i.e. maximize utility
    Explore space of alternative mappings (search problem)
    Runtime monitoring and control
  [Figure: utility plotted against response time]
  Kephart, Das: Achieving self-management via utility functions. IEEE Internet Computing 2007
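
The search for the most valuable feasible state can be illustrated with an exhaustive search over CPU allocations. The utility shape (full value up to a response-time goal, then linear decay) and the performance model (response time = work / allocated units) are assumptions made for this sketch, not the model of the cited paper.

```python
from itertools import product


def utility(response_time, goal):
    """Hypothetical utility: 1.0 up to the goal, then linear decay to 0 at twice the goal."""
    if response_time <= goal:
        return 1.0
    return max(0.0, 2.0 - response_time / goal)


def best_allocation(jobs, total_units):
    """Exhaustively search allocations and return the one with maximal total utility.

    jobs: {name: (work, goal)}; assumed performance model:
    response_time = work / units, with at least one unit per job.
    Returns (allocation dict, total utility).
    """
    names = sorted(jobs)
    best, best_value = None, float("-inf")
    for units in product(range(1, total_units + 1), repeat=len(names)):
        if sum(units) != total_units:
            continue
        value = sum(
            utility(jobs[name][0] / share, jobs[name][1])
            for name, share in zip(names, units)
        )
        if value > best_value:
            best, best_value = dict(zip(names, units)), value
    return best, best_value
```

Exhaustive enumeration only works for tiny state spaces; it is used here to make the objective visible, while a real controller would pair a smarter search with runtime monitoring.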

Workload Modeling & Prediction
  Goal: predict resource requirements for a given workload, i.e., find correlation between query features and performance features
  Approaches: regression, correlation analysis, Kernel Canonical Correlation Analysis (KCCA)
  [Figure: query plans/job descriptions -> job feature matrix -> KCCA -> query plan projection; performance statistics -> performance feature matrix -> KCCA -> performance projection]
  Prediction:
    Calculate job coordinates in query plan projection based on job feature vector
    Infer job's coordinates on the performance projection
  Ganapathi et al.: Predicting Multiple Metrics for Queries: Better Decisions Enabled by Machine Learning. ICDE 2009
