WINDOWS AZURE MARKETPLACE
Business model:
- Base meter: $ per transaction (query or API call), charged per transaction or as a monthly subscription
  1. Trial usage: try before you buy
  2. Per transaction: pay-as-you-grow, transaction-based tiers
  3. Monthly subscription: subscribe to the dataset for unrestricted access
- Revenue sharing: Microsoft (20%) and content providers (80%)
- Free offers: no cost
WINDOWS AZURE MARKETPLACE MOMENTUM
- Registered users: ~50,000 across 38 countries (double-digit growth month over month)
- Subscriptions: 70,000+ (double-digit growth month over month)
- Providers: 500+
- Datasets: 130+ across 15 categories; apps: 600+
- Notable offerings: Data Quality Services, Microsoft Translator, Bing Search API
HOW TO ACCESS THE DATA?
Access from any platform or any app:
- Native OData-based APIs support a query language over HTTP
- Open standard protocol (www.odata.org)
- Native service reference usage in Visual Studio
- Downloadable proxy classes for fixed-function services
- URL-based query support
Authentication and authorization:
- Live ID for Marketplace access
- HTTP Basic auth over SSL with account key
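As a sketch of the URL-based query pattern, the snippet below builds an OData query string plus the Basic-auth header carrying the account key. The service root, entity set, and account key are hypothetical placeholders; real Marketplace datasets publish their own service URLs.

```python
import base64
from urllib.parse import urlencode

def build_odata_query(service_root, entity_set, account_key,
                      filter_expr=None, top=None):
    """Build an OData query URL and an HTTP Basic auth header.

    service_root/entity_set/account_key are placeholders, not a real
    Marketplace dataset.
    """
    params = {}
    if filter_expr:
        params["$filter"] = filter_expr   # standard OData system query option
    if top is not None:
        params["$top"] = str(top)
    url = f"{service_root}/{entity_set}"
    if params:
        url += "?" + urlencode(params)
    # Basic auth over SSL: the account key travels as the password.
    token = base64.b64encode(f"key:{account_key}".encode()).decode()
    headers = {"Authorization": f"Basic {token}"}
    return url, headers

url, headers = build_odata_query(
    "https://api.datamarket.azure.com/Example/Dataset",  # hypothetical root
    "Entries", "MY-ACCOUNT-KEY",
    filter_expr="Country eq 'US'", top=10)
```

The same URL works from any platform that can issue an HTTPS GET, which is the point of the open protocol.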
WHY IS DATA MOVEMENT SO CRITICAL IN AZURE?
- It is a common requirement for applications running in the cloud
- May have an impact on:
  - Initial production ramp-up (on premises to Azure)
  - Ongoing operations (Azure to Azure)
  - Data protection and recovery (Azure to Azure, Azure to on premises)
  - Data lifecycle management, archiving, pruning (Azure to Azure, Azure to on premises)
- Moving data between on premises and the cloud, and between cloud services, is slower*
BULK COPY PROGRAM (BCP.EXE)
- Fast option to move data in and out of Azure SQL Database
- Can run on premises, in a web/worker role, or in a virtual machine
- If used from on premises, expect potential latency issues
- Minimal deployment requirements (SQL Server Native Client, SNAC)
- Can be easily automated, scripted, or run as part of a batch program
- Can read/write data to/from a local disk or an attached persisted .vhd; consider IOPS performance requirements when choosing between the various options
- Can run parallel import/export streams from/to the same set of objects to reduce time
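One way to script the parallel-streams option is to generate one bcp.exe invocation per input file and launch them concurrently. The sketch below only builds the command lines; the server name, table, file names, and credentials are placeholders.

```python
def bcp_import_commands(table, data_files, server, database, user, password):
    """Build one bcp.exe command line per input file so several import
    streams can run in parallel against the same table.

    -n uses native format, -b sets the commit batch size; all
    names/credentials here are illustrative placeholders.
    """
    return [
        f"bcp {database}.dbo.{table} in {path} "
        f"-n -b 10000 -S {server} -U {user} -P {password}"
        for path in data_files
    ]

cmds = bcp_import_commands(
    "Orders",
    ["orders_part1.dat", "orders_part2.dat"],
    "myserver.database.windows.net", "MyDb", "loader", "example-password")
```

Each command can then be handed to the scheduler of your choice (a batch file, `subprocess`, SQL Agent), which is what makes BCP easy to automate.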
SQLBULKCOPY CLASS IN ADO.NET
- Provides a way to programmatically move data into SQL Database from .NET code
- Wrapper around the bulk insert APIs
- Can run on premises, in a web/worker role, or in a VM; as with the other options, proximity to the target service gives the best performance
- Available in ADO.NET since .NET Framework 2.0
SQLBULKCOPY CLASS IN ADO.NET
- Can be easily integrated with components like the Transient Fault Handling Application Block to provide robust connection management and handle throttling and other transient issues
- Provides all the other benefits of BCP.EXE, such as the ability to define a batch size and to run multiple insert streams in parallel
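The retry pattern the Transient Fault Handling Application Block applies around a bulk load can be sketched generically as exponential backoff over a known set of transient error codes. The error numbers below are illustrative, not an exhaustive Azure SQL Database list.

```python
import random
import time

# Illustrative Azure SQL throttling/transient error numbers (not exhaustive).
TRANSIENT_ERRORS = {40501, 40197, 10053}

def with_retries(operation, is_transient, max_attempts=5, base_delay=1.0):
    """Run `operation`, retrying with exponential backoff plus jitter
    when `is_transient(exc)` says the failure is worth retrying.

    Mirrors, in spirit, what the Transient Fault Handling block does
    around a SqlBulkCopy.WriteToServer call.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as exc:
            if attempt == max_attempts or not is_transient(exc):
                raise
            # double the delay each attempt, with up to 20% jitter
            delay = base_delay * (2 ** (attempt - 1)) * (1 + random.random() * 0.2)
            time.sleep(delay)
```

The caller supplies the bulk-insert closure and a predicate that checks the server error number against the transient set.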
ADDITIONAL CONSIDERATIONS ON BATCHING DATA INSERTS
- Executing multiple insert operations in a single round trip minimizes the impact of the increased network latency between Windows Azure compute nodes and SQL Azure databases
- The bulk copy APIs give the best performance results at 100+ rows per insert, but add significant overhead (3.5x) when used for single-row insertions
- Batching multiple INSERT statements in a single command text performs well between 1 and 10 records, but adds significant overhead (more than 3x) at 100 or 1,000 records per round trip
ADDITIONAL CONSIDERATIONS ON BATCHING DATA INSERTS
- A table-valued parameter approach in an "INSERT INTO … SELECT FROM @TVP" statement gives consistently good results at batch sizes of 1, 10, 100, and 1,000 rows per round trip
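To make the "multiple INSERT statements in a single command text" option concrete, the sketch below generates one parameterized command text carrying N insert statements, so N rows travel in one round trip. The table and column names are hypothetical.

```python
def batched_insert_sql(table, columns, row_count):
    """Build a single command text containing `row_count` parameterized
    INSERT statements, one parameter set per row (@p{row}_{col}).

    Table/column names here are illustrative placeholders.
    """
    col_list = ", ".join(columns)
    statements = []
    for i in range(row_count):
        placeholders = ", ".join(f"@p{i}_{j}" for j in range(len(columns)))
        statements.append(
            f"INSERT INTO {table} ({col_list}) VALUES ({placeholders});")
    return "\n".join(statements)

sql = batched_insert_sql("dbo.Events", ["Id", "Payload"], 3)
```

Per the measurements above, this shape pays off up to roughly 10 rows per round trip; beyond that, the bulk copy APIs or a TVP are the better fit.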
SQL SERVER INTEGRATION SERVICES
- Rich and mature data transformation framework and pipeline
- Well known to traditional DBAs and developers
- Many useful standard tasks and capabilities included out of the box
- Officially supported in Windows Azure VMs
- Works on web/worker roles, but is not supported there
- Automated package creation and deployment makes SSIS an interesting component for implementing complex data movement solutions in hybrid scenarios
BACPAC FORMAT AND IMPORT/EXPORT SERVICE
- The bacpac format provides a way to logically represent data models and database objects (T-SQL code, indexes, etc.) with version support and other benefits, together with a JSON-serialized instance of the database rows
- The Azure SQL Database Import/Export service uses the bacpac format to move data between a database and Azure Blob Storage, driven from the management portal or a scriptable service endpoint
- Bacpac files can then easily be downloaded on premises and imported into a SQL Server instance, or manipulated through a programmatic API
BACPAC FORMAT AND IMPORT/EXPORT SERVICE
- The Import/Export service does not provide a transactionally consistent copy of your SQL Database, so you must either stop the user workload or run the operation on a copy of your database created with the CREATE DATABASE … AS COPY OF option
- This option may seem the slowest compared to a raw export/import operation, but it performs several important tasks you would need to execute anyway, such as rebuilding a complex indexing strategy, compressing the exported data, and copying it to Azure Storage
- If your data movement needs involve some of these operations, the Import/Export service may be your best option
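A minimal sketch of the consistent-copy step: issue CREATE DATABASE … AS COPY OF against the server's master database, poll until the copy is online, then point the Import/Export service at the copy. The database names below are placeholders, and the polling query is one common way to check copy state, not the only one.

```python
def copy_database_sql(source_db, copy_db):
    """T-SQL for a transactionally consistent database copy; execute it
    against the master database of the Azure SQL Database server.
    Database names are placeholders.
    """
    return f"CREATE DATABASE [{copy_db}] AS COPY OF [{source_db}];"

def copy_status_sql(copy_db):
    """Poll until state_desc is ONLINE (it reads COPYING while the copy
    is in progress), then export the copy and drop it.
    """
    return ("SELECT state_desc FROM sys.databases "
            f"WHERE name = '{copy_db}';")
```

Exporting the copy rather than the live database is what restores transactional consistency to the resulting bacpac.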
GRADE OF THE STEEL: AZURE DATA MOVEMENT
Move data fast:
- Minimize the downtime window by finding the fastest and most reliable data movement mechanism available
- Warning: based on the previous assumptions, this process is non-deterministic in nature
Move data reliably:
- May fail between move phases as well
- Check for consistency and implement retry logic as necessary
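The consistency check after each move phase can be as simple as comparing a row count plus an aggregate checksum on both sides. The sketch below uses the common T-SQL CHECKSUM_AGG(BINARY_CHECKSUM(*)) pattern; the table name is a placeholder, and a checksum match is a strong but not absolute consistency signal.

```python
def consistency_check_sql(table):
    """Query to run on both source and target after a move phase:
    a row count plus an aggregate checksum over all columns.
    """
    return (f"SELECT COUNT_BIG(*) AS row_count, "
            f"CHECKSUM_AGG(BINARY_CHECKSUM(*)) AS table_checksum "
            f"FROM {table};")

def moves_match(source_result, target_result):
    """Compare the (row_count, checksum) tuples fetched from each side;
    a mismatch means the phase should be retried or investigated.
    """
    return source_result == target_result
```

Running this between phases is what lets the retry logic decide whether a phase completed or must be repeated.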
KEY DATA MOVEMENT SCENARIOS
- Initial data loading and migration
- Data synchronization
- Backup/restore for data protection and disaster recovery
- "Whoops" error protection and recovery
- Sharding / re-sharding
INITIAL DATA LOADING AND MIGRATION
- Move existing on-premises relational data to Azure SQL Database
- The data may reside in SQL Server or in a competitive product
- Many existing tools are available for this task: Import/Export service, BCP.EXE, SSIS
- Recommended approach: export data out of the on-premises databases, compress it and copy it to shared storage on Azure, and run your import operations from Azure roles to limit the latency impact
- Parallel loading streams can help performance
DATA SYNCHRONIZATION
- After initial data loading, continuous or scheduled synchronization between on-premises databases and the cloud may be required
- Data Sync Service is a cloud-based solution, built on top of Sync Framework, that hides most of the complexity and provides a code-less solution aimed at DBAs and system engineers
- Depending on the complexity of the sync flow, a custom solution leveraging change data capture on premises, or other differencing approaches, can be applied
- This scenario is critical for hybrid cloud implementations
MOVING CLOUD-GENERATED DATA TO ON PREMISES
- Moving cloud-generated data from Azure SQL Database to on-premises systems can be seen as a mono-directional version of data synchronization
- Depending on the complexity of customer requirements, this can be implemented by taking a "snapshot" of the production database, moving it into Azure Storage, and downloading it on premises to be loaded into a traditional system (e.g. a data warehouse)
- If customer requirements grow, a more invasive approach could be based on differential data movement, limited to only the data that has been modified over time
AZURE SQL DATABASE BACKUP / DR OPTIONS
- Backup/restore for data protection and disaster recovery
- "Whoops" error protection and recovery
PITR – Point-in-Time Restore (private CTP):
- 14 days of rolling database backups taken automatically, in addition to the 3 or 4 local replicas per database
- Currently restores the entire physical DB plus all logical DBs, then extracts your logical DB
DBCopy:
- Limited scaling options
- Must complete within 24 hours
DAC Import/Export service:
- Service runs in worker roles
- Scales by adding more roles to support a large number of DB backups per hour (~100K DBs, averaging ~300 MB each, in 6 hours using about 200 small instances)
- Backup file is geo-replicated (Windows Azure Blob Storage)
- NOTE: it does not guarantee transactional consistency – how to solve?
SHARDING / RE-SHARDING
- Take an existing monolithic database and move its data into a sharded/partitioned data tier
  - For the initial sharding operation, the source database can be on premises or already in Azure SQL Database
- Split an existing shard into one or more destination databases for scale-out or data management purposes
  - Usually a cloud-to-cloud operation; may become a recurring one, depending on the application and workload requirements
- Could ideally be performed online, but it is much easier to implement if the application can tolerate some downtime
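The shard-split step above reduces, for a numeric shard key, to partitioning one key range into contiguous sub-ranges, one per destination database. A minimal sketch, assuming half-open integer ranges:

```python
def split_range(low, high, parts):
    """Split the shard key range [low, high) into `parts` contiguous
    sub-ranges of near-equal size, e.g. when re-sharding one database
    into several destination databases.
    """
    if parts < 1 or high <= low:
        raise ValueError("invalid split")
    span = high - low
    # integer boundaries spread as evenly as possible across the span
    bounds = [low + span * i // parts for i in range(parts + 1)]
    return [(bounds[i], bounds[i + 1]) for i in range(parts)]

# Splitting the key range [0, 1000) of one shard across two new shards:
ranges = split_range(0, 1000, 2)  # [(0, 500), (500, 1000)]
```

Equal-width ranges assume a roughly uniform key distribution; skewed workloads would instead pick boundaries from observed row counts per key.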