Using Azure for Data Crunching - Lup Yuen Lee (NCS)


Presentation by Lup Yuen Lee (NCS) at the "MSDN Presents: Windows Azure Platform" event (Apr 13, 2010).

Published in: Technology

  1. Case Study: Using Azure for data crunching
     Lee Lup Yuen, Principal Consultant, Chief Architect Office
     13 Apr 2010
  2. The problem
       • I have a huge database of (say) book titles
       • I want to pre-generate a list of search suggestions for my search UI
       • Needs plenty of CPU and RAM
       • I want it fast and cheap
       • Why use Azure?
           • On-demand computing + storage
           • Ready-to-use .NET and SQL infra
     04/18/10
  3. The current database
       • Database consists of titles, keywords and other metadata
       • SQL Server 2008
       • 2.9 GB of data
       • 4.3 million records
  4. Migrating the database to SQL Azure
       • What's different about SQL Azure?
           • SQL Server 2008 R2 can be used for remote mgmt
           • Some GUI mgmt functions are not supported; need to use T-SQL commands
           • Need to open the Azure Firewall to allow remote SQL access
           • All tables must have a clustered index
           • CLR and Table Partitioning not supported
           • More details:
       • Copy data using SSIS Import/Export Data Wizard
           • Choose the ".NET Framework Data Provider for SQL Server"
  5. SQL Azure
       • 2.9 GB database, 4.3 million records uploaded to Azure in about 1 hour
  6. SQL Azure Firewall
       • Add your firewall rule
  7. Using SQL Server Mgmt Studio with Azure
       • Azure SQL Server
       • Local SQL Server
  8. The current app
       • Based on .NET 3.5; 700 lines of code
       • Scans titles and keywords for substrings and counts the number of occurrences of each substring
       • Brute force approach
           • 100% CPU utilisation
           • Can use more than 10 GB of runtime memory
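The brute-force scan described on this slide can be sketched roughly as follows. This is a minimal illustration in Python, not the presenter's .NET code: the function names, the substring length limits (`min_len`, `max_len`), the lower-casing, and the sample titles are all assumptions made for the example.

```python
from collections import Counter


def count_substrings(texts, min_len=2, max_len=4):
    """Count every substring of length min_len..max_len across all texts.

    The length limits are hypothetical; the slides do not say which
    substring lengths the original app considered.
    """
    counts = Counter()
    for text in texts:
        s = text.lower()
        for start in range(len(s)):
            # Try every allowed length that still fits at this position.
            for length in range(min_len, max_len + 1):
                if start + length > len(s):
                    break
                counts[s[start:start + length]] += 1
    return counts


def top_suggestions(counts, prefix, limit=5):
    """Return the most frequent counted substrings starting with prefix."""
    matches = [(sub, n) for sub, n in counts.items() if sub.startswith(prefix)]
    # Highest count first; ties broken alphabetically for stable output.
    matches.sort(key=lambda pair: (-pair[1], pair[0]))
    return [sub for sub, _ in matches[:limit]]


# Made-up sample titles, standing in for the 4.3 million records.
titles = ["Data Mining", "Data Crunching", "Big Data"]
counts = count_substrings(titles)
suggestions = top_suggestions(counts, "da")
```

Because every start position is combined with every allowed length, the counter grows quickly with the size of the input, which is consistent with the heavy CPU and memory use the slide reports.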
  9. Migrating the app to Windows Azure
       • What's different about .NET on Azure?
           • SQL works OK, need to use Azure connection string like;Database=xxx;User ID=xxx@xxx;Password=xxx;Trusted_Connection=False;Encrypt=True;
           • Filesystem support: requires virtual drive configuration
           • Beware of unmanaged code and access to local resources
       • Just follow the steps at
           • Download and install the Azure SDK
           • Create an Azure solution in Visual Studio
           • Copy your code into the Worker Role, Web Role or WCF Role
           • Configure the Virtual Machine for ExtraLarge if necessary
  10. Visual Studio with Azure SDK
       • ExtraLarge VM: 8 CPU cores, 15 GB memory, 2,000 GB disk space
       • Azure Project
       • WCF App
       • Web App
       • Worker App
  11. Deploying and running the Azure app
       • Test the app locally using the Azure SDK
       • Use Visual Studio to build and publish the app
       • Go to the Azure website, create a hosted service
       • Upload the *.cspkg and *.cscfg files
       • Click "Run"
  12. Windows Azure Hosted Service
  13. Deploying a hosted service
  14. Azure Performance
       • Comparing Azure with my low-cost server
           • Core i7 3 GHz, 12 GB RAM, Win2008 R2 64-bit, SQL2008 R2 Nov CTP 64-bit
           • S$3,000 from Sim Lim computer shop
       • 100% RAM utilisation

       Processing time for    My Server           Azure
       500K records           0.5 hours           0.6 hours
       1M records             1.5 hours           2 hours
       2M records             SQL Timeout         6.5 hours
       4M records             (To be provided)    (To be provided)
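To make the comparison easier to read, the timings in the table can be turned into a simple ratio of Azure time to server time. The sketch below only restates the slide's numbers; the dictionary layout and the `azure_slowdown` helper are illustrative names, and the 4M-record row is left out because its results were not provided.

```python
# Processing times in hours, copied from the table on this slide.
# None marks the run that ended in a SQL timeout on the local server.
timings = {
    "500K records": {"server": 0.5, "azure": 0.6},
    "1M records": {"server": 1.5, "azure": 2.0},
    "2M records": {"server": None, "azure": 6.5},
}


def azure_slowdown(row):
    """Return Azure time divided by server time, or None if either run is missing."""
    if row["server"] is None or row["azure"] is None:
        return None
    return row["azure"] / row["server"]


for label, row in timings.items():
    ratio = azure_slowdown(row)
    if ratio is None:
        print(f"{label}: no ratio (one of the runs did not complete)")
    else:
        print(f"{label}: Azure took {ratio:.2f}x the server time")
```

On the two rows where both runs finished, Azure took somewhat longer than the local server, but only Azure completed the 2M-record run.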
  15. Azure Pricing
       • Source:
       • Compute
           • US$0.12 / hour for the SMALL instance
           • US$0.24 / hour for the MEDIUM instance
           • US$0.48 / hour for the LARGE instance
           • US$0.96 / hour for the EXTRA LARGE instance
       • Storage
           • US$0.15 / GB stored / month
           • US$0.01 / 10K storage transactions
       • SQL Azure
           • Web Edition: up to 1 GB relational database = US$9.99
           • Business Edition: up to 10 GB relational database = US$99.99
       • Data Transfers
           • US$0.10 in / US$0.15 out / GB for North America and Europe
           • US$0.30 in / US$0.45 out / GB for Asia Pacific
           • Inbound data transfers during off-peak times through June 30, 2010 are at no charge. Prices revert to our normal inbound data transfer rates after June 30, 2010.
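As a rough illustration of how the compute rates above translate into a cost, the following sketch multiplies an hourly rate by the running time. It covers compute only and ignores storage, SQL Azure, data transfer, and any rounding of partial hours. The function names are hypothetical, and pairing the EXTRA LARGE rate with the 6.5-hour run from the performance slide is an assumption, since the slides do not state which instance size produced that timing.

```python
# Hourly compute rates in US$, copied from the pricing slide.
COMPUTE_RATE_PER_HOUR = {
    "small": 0.12,
    "medium": 0.24,
    "large": 0.48,
    "extralarge": 0.96,
}


def compute_cost(instance_size, hours, instances=1):
    """Estimate compute charges in US$ for the given instance size and duration."""
    rate = COMPUTE_RATE_PER_HOUR[instance_size]
    return round(rate * hours * instances, 2)


def hours_for_budget(instance_size, budget_usd):
    """Return how many instance-hours a budget buys at the listed rate."""
    return budget_usd / COMPUTE_RATE_PER_HOUR[instance_size]


# Assumed scenario: one EXTRA LARGE instance running for 6.5 hours.
run_cost = compute_cost("extralarge", 6.5)
print(f"Estimated compute cost: US${run_cost:.2f}")
```

The same helper can be used to compare instance sizes, for example `compute_cost("small", 10)` against `compute_cost("extralarge", 10)`, before deciding which VM size to configure.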
  16. My Azure Bill
       • Total Bill: US$ 122.88
  17. Sample Detailed Bill
  18. Lessons learnt
       • Test locally before deploying to cloud
           • Deployment may take a few minutes
       • Migrating apps and data to Azure can be fast and easy
           • Mine took only 2 days
       • Azure is cost-effective for my app
           • Costs less than a low-cost server