Using Azure for Data Crunching - Lup Yuen Lee (NCS)

Presentation by Lup Yuen Lee (NCS) at the "MSDN Presents: Windows Azure Platform" event (Apr 13, 2010).

    Presentation Transcript

    • Case Study: Using Azure for data crunching
      • Lee Lup Yuen, Principal Consultant, Chief Architect Office
      • 13 Apr 2010
    • The problem (slide footer date: 04/18/10)
      • I have a huge database of (say) book titles
      • I want to pre-generate a list of search suggestions for my search UI
      • Needs plenty of CPU and RAM
      • I want it fast and cheap
      • Why use Azure?
        • On demand computing + storage
        • Ready-to-use .NET and SQL infra
    • The current database
      • Database consists of titles, keywords and other metadata
      • SQL Server 2008
      • 2.9 GB of data
      • 4.3 million records
    • Migrating the database to SQL Azure
      • What’s different about SQL Azure?
        • SQL Server 2008 R2 can be used for remote mgmt
        • Some GUI mgmt functions not supported, need to use T-SQL commands
        • Need to open the Azure Firewall to allow remote SQL access
        • All tables must have a clustered index
        • CLR and Table Partitioning not supported
        • More details: http://msdn.microsoft.com/en-us/library/ff394102.aspx
      • Copy data using SSIS Import/Export Data Wizard
        • Choose the “.NET Framework Data Provider for SQL Server”
    • SQL Azure
      • 2.9 GB database, 4.3 million records uploaded to Azure in about 1 hour
    • SQL Azure Firewall
      • Add your firewall rule
    • Using SQL Server Mgmt Studio with Azure
      • Screenshot labels: Azure SQL Server, Local SQL Server
    • The current app
      • Based on .NET 3.5; 700 lines of code
      • Scans titles and keywords for substrings and counts the number of occurrences of each substring
      • Brute force approach
        • 100% CPU utilisation
        • Can use more than 10 GB of runtime memory
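
The brute-force counting described above can be sketched roughly as follows. This is a minimal illustration, not the presenter's 700-line .NET app: the function names, the substring length limits, and the ranking rule in `suggest` are all assumptions made for the example.

```python
from collections import Counter


def count_substrings(titles, min_len=2, max_len=5):
    """Count every occurrence of every substring (hypothetical length limits)."""
    counts = Counter()
    for title in titles:
        text = title.lower()
        for start in range(len(text)):
            # Longest substring that still fits in the text from this start.
            longest = min(max_len, len(text) - start)
            for length in range(min_len, longest + 1):
                counts[text[start:start + length]] += 1
    return counts


def suggest(counts, prefix, limit=3):
    """Return the most frequent counted substrings that start with prefix."""
    matches = [(s, n) for s, n in counts.items() if s.startswith(prefix)]
    # Most frequent first; ties broken alphabetically for a stable order.
    matches.sort(key=lambda item: (-item[1], item[0]))
    return [s for s, _ in matches[:limit]]


counts = count_substrings(["Azure", "Azure SQL"])
print(suggest(counts, "az"))
```

Every title contributes a number of substrings that grows with its length, and the counter keeps all of them in memory at once, which is consistent with the heavy CPU and RAM use noted above.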
    • Migrating the app to Windows Azure
      • What’s different about .NET on Azure?
        • SQL works OK, need to use Azure connection string like Server=tcp:xxx.database.windows.net;Database=xxx;User ID=xxx@xxx;Password=xxx;Trusted_Connection=False;Encrypt=True;
        • Filesystem Support: Requires virtual drive configuration
        • Beware of unmanaged code and access to local resources
      • Just follow the steps at http://msdn.microsoft.com/en-us/library/dd179367.aspx
        • Download and install the Azure SDK
        • Create an Azure solution in Visual Studio
        • Copy your code into the Worker Role, Web Role or WCF Role
        • Configure the Virtual Machine for ExtraLarge if necessary
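
The connection string format shown above can be assembled from its parts as in the sketch below. The helper name and the sample values are hypothetical, and the code assumes that the second `xxx` in `User ID=xxx@xxx` is the server name, which the slide does not spell out.

```python
def build_sql_azure_connection_string(server, database, user, password):
    """Assemble a connection string in the format shown on the slide.

    Assumption: the part after '@' in User ID is the server name.
    """
    parts = [
        ("Server", f"tcp:{server}.database.windows.net"),
        ("Database", database),
        ("User ID", f"{user}@{server}"),
        ("Password", password),
        ("Trusted_Connection", "False"),  # SQL login, not Windows login
        ("Encrypt", "True"),              # encrypt traffic to the cloud database
    ]
    return "".join(f"{key}={value};" for key, value in parts)


# Hypothetical sample values, for illustration only.
print(build_sql_azure_connection_string("myserver", "books", "admin", "secret"))
```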
    • Visual Studio with Azure SDK
      • ExtraLarge VM: 8 CPU cores, 15 GB memory, 2,000 GB disk space
      • Screenshot labels: Azure Project, WCF App, Web App, Worker App
    • Deploying and running the Azure app
      • Test the app locally using the Azure SDK
      • Use Visual Studio to build and publish the app
      • Go to the Azure website, create a hosted service
      • Upload the *.cspkg and *.cscfg files
      • Click “Run”
    • Windows Azure Hosted Service
    • Deploying a hosted service
    • Azure Performance
      • Comparing Azure with my low-cost server
        • Core i7 3 GHz, 12 GB RAM, Win2008 R2 64-bit, SQL2008 R2 Nov CTP 64-bit
        • S$3,000 from Sim Lim computer shop
      • Processing time
        • 500K records: My Server 0.5 hours, Azure 0.6 hours
        • 1M records: My Server 1.5 hours, Azure 2 hours
        • 2M records: My Server SQL Timeout, Azure 6.5 hours
        • 4M records: My Server (To be provided), Azure (To be provided)
      • Slide callout: 100% RAM utilisation
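
One way to read the Azure timings above is to ask how processing time grows when the record count doubles. The sketch below is a back-of-the-envelope calculation on the three reported Azure figures only; the power-law model and the function name are assumptions, not something the presentation states.

```python
import math

# Azure processing times reported on the slide: records -> hours.
azure_hours = {500_000: 0.6, 1_000_000: 2.0, 2_000_000: 6.5}


def growth_exponent(n1, t1, n2, t2):
    """Exponent k in time ~ records**k, estimated from two measurements."""
    return math.log(t2 / t1) / math.log(n2 / n1)


sizes = sorted(azure_hours)
for small, large in zip(sizes, sizes[1:]):
    k = growth_exponent(small, azure_hours[small], large, azure_hours[large])
    print(f"{small} -> {large} records: exponent {k:.2f}")
```

Both steps give an exponent of about 1.7, so each doubling of the input more than triples the run time. With only three data points this is a rough indication, not a measured complexity.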
    • Azure Pricing
      • Source: http://www.microsoft.com/windowsazure/faq/#pricing
      • Compute
        • US$0.12 / hour for the SMALL instance
        • US$0.24 / hour for the MEDIUM instance
        • US$0.48 / hour for the LARGE instance
        • US$0.96 / hour for the EXTRA LARGE instance
      • Storage
        • US$0.15 / GB stored / month
        • US$0.01 / 10K storage transactions
      • SQL Azure
        • Web Edition – Up to 1 GB relational database = US$9.99
        • Business Edition – Up to 10 GB relational database = US$99.99
      • Data Transfers
        • US$0.10 in / US$0.15 out / GB for North America and Europe
        • US$0.30 in / US$0.45 out / GB for Asia Pacific
        • Inbound data transfers during off-peak times through June 30, 2010 are at no charge. Prices revert to our normal inbound data transfer rates after June 30, 2010.
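
The price list above lends itself to a small cost estimate. The sketch below uses only the 2010 rates quoted on the slide; the function names are hypothetical, and pairing the 6.5-hour run with the ExtraLarge instance is an assumption, since the transcript does not say which instance size each benchmark used.

```python
from decimal import Decimal

# Compute rates from the slide, in US$ per hour.
COMPUTE_USD_PER_HOUR = {
    "small": Decimal("0.12"),
    "medium": Decimal("0.24"),
    "large": Decimal("0.48"),
    "extralarge": Decimal("0.96"),
}

# SQL Azure editions from the slide: (name, max size in GB, price in US$).
SQL_AZURE_EDITIONS = [
    ("web", 1, Decimal("9.99")),
    ("business", 10, Decimal("99.99")),
]


def compute_cost(size, hours):
    """Compute charge for running one instance of the given size."""
    return COMPUTE_USD_PER_HOUR[size] * Decimal(str(hours))


def pick_sql_edition(database_gb):
    """Smallest listed edition that fits the database."""
    for name, max_gb, price in SQL_AZURE_EDITIONS:
        if database_gb <= max_gb:
            return name, price
    raise ValueError("database exceeds the largest listed edition")


# Assumed scenario: a 6.5-hour run on one ExtraLarge instance.
print(compute_cost("extralarge", 6.5))
# The 2.9 GB database is over the 1 GB Web Edition limit.
print(pick_sql_edition(2.9))
```

Decimal arithmetic is used so that the currency amounts are not distorted by binary floating-point rounding.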
    • My Azure Bill
      • Total Bill: US$ 122.88
    • Sample Detailed Bill
    • Lessons learnt
      • Test locally before deploying to cloud
        • Deployment may take a few minutes
      • Migrating apps and data to Azure can be fast and easy
        • Mine took only 2 days
      • Azure is cost-effective for my app
        • Costs less than a low-cost server