The Azure Storage service provides a massively scalable solution for applications that require durable, highly available storage for their data. What are your options if you need to get large amounts of data into, out of, or between your Azure Storage accounts? This talk offers a quick introduction to AzCopy, a tool built on top of the Azure Storage APIs that provides command-line functionality for moving data into or across Azure Blob, File, and Table storage accounts. Its new “cousin”, the Azure Storage Data Movement Library – which provides programmatic access to AzCopy’s functionality – will also be discussed.
2. Consulting/Training
Cloud-native storage solution for Microsoft Azure
Designed for durability, availability
Pay-as-you-go: amount stored & egress (ingress is free)
Grouped into “accounts”
Up to 500 TB of data stored per account
Up to 100 named accounts per subscription (can be increased via support)
Storage Services for working with Blob, Table, Queue, File
Azure Storage
3.
REST API
PowerShell, Azure CLI
Azure Storage SDK
1st & 3rd Party GUI Tools
MSFT Storage Explorer List (updated 7/2015): http://j.mp/AzureStorageExplorers
Direct-Ship Drives
Azure Services
Working With Data In Azure Storage
And what if I want to move data between Storage Accounts?
4.
Command-line utility
Copy data to/from Blob, File, Table storage
Operates with local file system or between storage accounts
Wrapper around the .NET Storage API, adding parallel execution, journaling, and progress reporting
http://aka.ms/downloadazcopy
Introducing AzCopy
6.
AzCopy uses a “Journal File”
Default Location:
%SystemDrive%\Users\%username%\AppData\Local\Microsoft\Azure\AzCopy
Can be repointed with the /Z parameter
If the file exists
If the command matches, the journal contents are used to resume
If the command does not match, you are prompted to overwrite the journal
The journal file is deleted upon successful completion of an operation
Resuming Incomplete Operations
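The resume behavior above can be sketched as a command line; the /Z folder, paths, and key below are placeholders, and the snippet only assembles and prints the command:

```python
# Sketch: the /Z parameter points AzCopy at an explicit journal folder so an
# interrupted copy can be resumed by re-running the identical command.
# Paths and the key are placeholders; nothing is executed.
def azcopy_with_journal(source, dest, dest_key, journal_dir):
    return " ".join([
        "AzCopy",
        f"/Source:{source}",
        f"/Dest:{dest}",
        f"/DestKey:{dest_key}",
        f"/Z:{journal_dir}",  # journal location; default is under %LocalAppData%
    ])

cmd = azcopy_with_journal(
    r"C:\data",
    "https://myaccount.blob.core.windows.net/backups",
    "<storage-key>",
    r"D:\journal",
)
print(cmd)
# Re-running the identical command after an interruption resumes from the journal.
```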
8.
JSON or CSV, specified with the /PayloadFormat parameter
CSV download includes a .schema.csv file for each data file
<account name>_<table name>_<timestamp>.<volume index>_<crc>.json
Volume Index = m_n, where m = partition key range index, n = split index (zeros if omitted)
Can split based on partition key range with the /PKRS parameter – allows parallel export
Can split based on size with the /SplitSize parameter (size in MB, min = 32)
Working with Azure Storage Tables - Export
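The export options above combine on one command line. As a sketch (the account, table, local path, key, and PKRS ranges are placeholders, and the command is only assembled and printed):

```python
# Sketch: AzCopy table export with parallel split by partition key range
# (/PKRS, ranges separated by '#') and by size (/SplitSize, MB, minimum 32).
# Names and the key are placeholders; nothing is executed.
def azcopy_table_export(account, table, local_dir, source_key):
    return " ".join([
        "AzCopy",
        f"/Source:https://{account}.table.core.windows.net/{table}",
        f"/Dest:{local_dir}",
        f"/SourceKey:{source_key}",
        "/PayloadFormat:CSV",  # default payload format is JSON
        "/PKRS:AL#AK#AZ",      # partition key ranges enable parallel export
        "/SplitSize:64",       # split output files at 64 MB
    ])

print(azcopy_table_export("myaccount", "colleges", r"C:\export", "<storage-key>"))
```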
9.
JSON only
Uses a manifest file to locate data files and perform validation
Manifest file is created as part of Export
<account name>_<table name>_<timestamp>.manifest
Specify how to handle PK/RK collisions with the /EntityOperation parameter –
InsertOrSkip, InsertOrMerge, InsertOrReplace
Working with Azure Storage Tables - Import
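An import command mirrors the export, driven by the manifest file. As a sketch (the local path, account, table, key, and manifest name are placeholders, and the command is only assembled and printed):

```python
# Sketch: AzCopy table import. Import is JSON-only and locates/validates the
# data files via the manifest produced during export; /EntityOperation
# controls PK/RK collision handling. All values are placeholders.
def azcopy_table_import(local_dir, account, table, dest_key, manifest):
    return " ".join([
        "AzCopy",
        f"/Source:{local_dir}",
        f"/Dest:https://{account}.table.core.windows.net/{table}",
        f"/DestKey:{dest_key}",
        f"/Manifest:{manifest}",
        "/EntityOperation:InsertOrReplace",  # or InsertOrSkip, InsertOrMerge
    ])

print(azcopy_table_import(r"C:\export", "myaccount", "colleges",
                          "<storage-key>", "myaccount_colleges_<timestamp>.manifest"))
```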
11.
Based on the core data movement framework that powers AzCopy
Works with Blobs & Files
NuGet & Source (with Samples)
NuGet:
https://www.nuget.org/packages/Microsoft.Azure.Storage.DataMovement
Git:
https://github.com/Azure/azure-storage-net-data-movement
Introducing the Data Movement Library
One VERY interesting sample – “S3ToAzureSample”
12.
TransferManager
Copy, CopyDirectory, Download, DownloadDirectory, Upload, UploadDirectory
Set Azure Storage coordinates using “standard” Azure Storage API calls
Options (Recursive, SearchPattern, etc.)
TransferContext (ProgressHandler, OverwriteCallback, etc.)
There are some “best practices” you should follow to set up HTTP communications the way the DML needs them
Working With the Data Movement Library
Yes, that’s 50 petabytes of data storage available per Azure subscription (100 accounts × 500 TB each)
Storage Explorers: http://blogs.msdn.com/b/windowsazurestorage/archive/2014/03/11/windows-azure-storage-explorers-2014.aspx
Don’t forget Cloud Explorer in Visual Studio!
Some of the tools are quite powerful, offering multi-threading and more. AzCopy is free and is useful for integration into build commands, scripts, etc.
Azure Services can generate/transform data from other sources to Azure Storage (Stream Analytics, Data Factory)
Wrapper – discuss “Best Practices” that it wraps/sets
Demo – Upload local FS file or Folder to Blob with AzCopy
Demo – Add SAS (either param or URL)
Demo - Download
Demo – Transfer between storage accounts with AzCopy
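For the cross-account transfer, the destination can be authorized with a SAS token instead of an account key. As a sketch (all accounts, containers, key, and SAS values are placeholders, and the command is only assembled and printed):

```python
# Sketch: copy between two storage accounts, using an account key on the
# source and a SAS token on the destination. All values are placeholders;
# nothing is executed.
def azcopy_cross_account(src_account, src_container, src_key,
                         dest_account, dest_container, dest_sas):
    return " ".join([
        "AzCopy",
        f"/Source:https://{src_account}.blob.core.windows.net/{src_container}",
        f"/Dest:https://{dest_account}.blob.core.windows.net/{dest_container}",
        f"/SourceKey:{src_key}",
        f"/DestSAS:{dest_sas}",  # SAS scopes/limits access on the destination
        "/S",
    ])

print(azcopy_cross_account("srcacct", "data", "<src-key>",
                           "destacct", "data", "<sas-token>"))
```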
Note: Using US College Survey Data (including non-CONUS venues), PK=State, RK=UNITID, Download PKRS=State Abbrevs
TODO – Some GUI code to exercise the API
Build: March 30 – April 1 in San Francisco
Convergence: April 4-7, 2016 in New Orleans
WPC: July 10-14, 2016 in Toronto
Microsoft Ignite: September 26-30, 2016 in Atlanta