Windows Azure Conference 2014
Data Storage Options on Windows Azure
Govind Kanshi
MTC

Choosing right data store & processing

Data is essential to an application. Today Azure provides multiple options to store it. There are various ways to skin the cat based on hosted/host your own and working through performance, availability, scale, licensing etc.

Published in: Technology
  • Notes: questions to settle when choosing a store
      – How to model your data
      – What data access patterns your application will need
      – How you deal with data integrity/consistency (what happens if two applications need to access the same data in read/write mode?)
      – What the final complexity, performance, and scalability of the solution are, based on the decisions made on the points above
      – Access mechanisms: REST vs API vs SQL (or SQL-like mechanisms such as Cassandra's)
      – Joins? Mostly not. Document DBs allow nesting and transactions; columnar stores allow storing columns together and spreading them
    Other ways to skin the cat:
      – Data model: k/v, column
      – API: single tuple, range
      – Partition (ordered/random)
      – Optimized for reads/writes
      – Versions: version/timestamp
      – Replication: quorum/filesystem
      – Architecture: decentralized, hierarchical
      – Data center aware?
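
The "Partition (ordered/random)" note can be illustrated with a minimal Python sketch. It is not taken from the deck: the function names, the split points, and the use of MD5 as the hash are assumptions chosen for illustration. Ordered (range) partitioning keeps neighbouring keys together, which helps range scans; random (hash) partitioning spreads keys evenly, which helps write throughput.

```python
import bisect
import hashlib


def hash_partition(key: str, num_partitions: int) -> int:
    """Random placement: hash the key, then take it modulo the partition count."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def range_partition(key: str, split_points: list[str]) -> int:
    """Ordered placement: split_points are sorted boundaries between partitions."""
    return bisect.bisect_right(split_points, key)


# Hypothetical boundaries: partition 0 is keys < "h", 1 is "h" <= key < "p", 2 is the rest.
splits = ["h", "p"]
for k in ["apple", "kiwi", "zebra"]:
    print(k, "range ->", range_partition(k, splits), "hash ->", hash_partition(k, 3))
```

With range partitioning, a scan over keys "a" to "g" touches a single partition; with hash partitioning the same scan has to ask every partition.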
  • Choosing right data store & processing

    1. Windows Azure Conference 2014
       Data Storage Options on Windows Azure
       Govind Kanshi, MTC
    2. Ways to skin the cat (store)
       • Hosting options
       • What you need to worry about
         – Availability
         – Performance
         – Scale...
       • Where do I store data
    3. Hosting option
       • Hosted
       • Host your own
       • What you need to worry about
         – Availability
         – Performance (more compute/bandwidth/better storage)
         – Scale (throughput/latency/storage)
         – Management/Monitoring
         – Cost
    4. Hosting option path
       • Hosted (the "not my headache" option)
         – No admin (the majority of it: setup/maintenance)
         – Availability: better and cheaper
         – Very little planning/spend on machine size and resources
         – Focus on the application, not on admin/management issues
    5. Hosting options path
       • Hosted
         – No admin (the majority of it: setup/maintenance)
         – Availability: better and cheaper
         – Very little planning/spend on machine size and resources
         – Focus on the application, not on admin/management issues
       • Host your own
         – Flexibility (use jobs, use replication, use broker)
         – Roll your own availability, performance, upgrades, patching
         – Plan your scale and spend
         – Plan for admin: have in-house expertise
    6. Offerings
       • Relational
         – Hosted
           • SQL Azure
         – Host your own
           • SQL Server, Oracle, MySQL, Postgres
       • Non-relational
         – Hosted
           • Table Storage (key/value), Blob/Page store
           • Mongo
         – Host your own
           • Cassandra, Mongo, Redis
    7. Availability
       • Hosted
         – SQL Azure
           • Local transparent failover: no direct access to replicas
           • Replicas, remote? (ship logs and fail over via Traffic Manager), take backups
           • Replicas, read only? In future (local vs across data centers)
         – Azure Storage
           • Local transparent failover: no direct access to replicas
           • Remote replication (no guaranteed SLA, but usually within minutes)
       • Host your own
         – Availability sets
           • Need to set up a Virtual Network
           • Need to create a sync mechanism
           • Need to set up a failover mechanism
         – AlwaysOn for SQL Server; other databases need to get it right like SQL Server
         – Use Azure Storage: push backups
    8. Performance
       • Hosted
         – Azure provides various options
           • SQL Azure Premium vs regular (removes the noisy neighbor issue)
           • Pretty soon other services will distinguish themselves by performance (think H)
         – SQL Azure Premium provides reserved IOPS
       • Host your own
         – Choose better compute
         – Choose better storage
           • Soon good news on more options
         – At the end of the day, you need to set up monitoring, fixing, and planning yourself
    9. Scale (Up/Out)
       • Hosted
         – SQL Azure
           • Web/Business (storage) vs SQL Premium (isolated performance)
         – HDInsight
           • Scale-out vs scale-up of nodes (disruptive)
         – Table Storage/Azure Blob/Queues, Service Bus (a little different)
           • Unlimited storage (overall 200TB): no explicit limit (no scale-up SKU)
       • Host your own
         – Need to plan for provisioning of storage/compute based on the offering (Redis vs Cassandra vs HBase). Monitoring, handling failover, etc. are extra effort.
    10. Management/Monitoring
       • Hosted
         – API or dashboard (mostly)
         – Everything is abstracted: cost and operations matter rather than OS/memory etc.
         – Mostly auto-managed/healed, with the overall backend taking care of many things
         – No worries about patch management, backup schedules, etc.
       • Host your own
         – Roll out your own (time vs what to expose/use/act upon)
         – Cloud-aware software needed. System Center can do x things
         – The backend can take care of, say, compute failover or storage, but the rest needs to be built on top.
    11. Cost
       • Hosted
         – Generally easy (volume stored, units processed/sent)
         – For ISVs, billing is still an exercise; it should become better
       • Host your own
         – Roll your own: basically, what you use is what you pay for
         – Plus licensing blues
         – Plus dedicated people (sometimes a hierarchy: one to do day-to-day jobs, another to help business/dev)
    12. What to check for in Host your own
       • License portability
       • Certification
       • Support
       • Preferred usage: Dev/Test vs Production
    13. Why different kinds of store
       • Data is complex: struct of struct of maps
       • Data is changing shape
       • Lots of data is collected: scale of storage
         – Time series
           • Sensors
           • Audit events
         – Data is schema?
           • Easy to add new fields, and even completely change the structure of a model.
       • Need a query model over shape rather than just key/value or a pseudo-mapping to the relational world
       • Low latency, high volume
    14. What kind of data
       • What is my scenario
         – Caching: Velocity, Memcached, Redis, Riak
         – Counters/speed/write: Velocity, Redis, Cassandra
         – Transactions: Database, SQL Azure (federation)
         – Documents/JSON-ified class/shape: MongoDB, RavenDB, Riak *
         – Write large amounts of data with throughput: Cassandra, Azure Storage
         – Full text search: Solr/ElasticSearch, Sphinx
         – Store data for scale-out compute: Hadoop
         – Store data on a specialized appliance: PDW
       * Wished we could query shape data rather than fitting it into the relational world of columns/rows
    15. Where do I store my data: location
       (location: data kept there: example store)
       • Low latency, local memory: ref data: in-node cache
       • Low latency, shared memory: session data: Azure Cache
       • Dedicated machine: Tx data: relational DB
       • Shared high-throughput storage: Tx data: SQL Azure relational DB
       • Shared entity storage: entity data: Azure Table
       • Shared raw, batch, long-term storage: data lake/store everything: HDInsight
    16. Or another way to think
       • Will I write a lot of data and need to store and query it?
       • Will I need very low latency?
       • Can I compromise on consistency?
       • What are my business needs (how fast are we growing)? Can I afford to take a break and roll in a new store?
    17. How will we get/store the data
       • Query
         – SQL, LINQ, ORM-ed (challenge: mapping to every language) or REST
         – Custom (query format, compression, serialization)
       • Tunable consistency
         – Write: out of 5 nodes, consider the value written only when 3 acknowledge
         – Read: out of 5 nodes, take the value once 2 respond
    18. Guidance
       Columns: Stores | Hosted (Microsoft; Non Microsoft/Partner) | Host your own (Microsoft/Partner; Non Microsoft)
       • Relational: SQLAzure; Sql Server, Access; Oracle, SAP, My
       • Caching: Azure Cache; Memcache; Redis, Memcache
       • K-v/Column store: Azure Table; Cassandra, Riak, Hbase
       • Document store: AzureTable?; Mongo; MongoDB
       • Graph store: Neo4j
       • VL-Scaleout: HDInsight; HortonWorks HDP; Cloudera?
       • In-Memory DS: Azure Cache; Redis
       • Streaming/Queue/EAI: Azure queue, Notification, Biztalk; StreamInsight, MSMQ, Biztalk; Storm, Kafka
       • Long term: Azure Storage; Build your own
       • Text: Azure Table; Solr; SQL server; Solr, Elastic Search
    19. End
    20. Compare them: summary (evolving)
       Columns: Key Value | Document | Column | Graph
       • Persistence-Json: * * *
       • ACID: # # #
       • Query mode: API/REST | API/REST | API | SPARQL/Rest/Java
       • Scale: Horizontal | Horizontal | Horizontal | Vertical scale
       • Replication: Async | Async/tunable | Tunable | NA
       • Schema free: * | * | + | *
       • Mapreduce: # | # | # | NA
       • Node-Addn/Dln: + Manual | # | * | NA
       • Indexing: Primary key | Attributes | # | *
       Legend: * = most of them support, # = specific product support, + = partial support
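
The tunable-consistency example on slide 17 (5 nodes, 3 acknowledgements for a write, 2 responses for a read) can be sketched in Python as below. This is only an illustration under stated assumptions: `QuorumConfig` and its method names are hypothetical and do not come from the deck or from any particular store's API. The one general rule it encodes is that a read is guaranteed to overlap the latest successful write only when R + W > N.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuorumConfig:
    """Hypothetical helper: n replicas, w write acks required, r read responses required."""
    n: int
    w: int
    r: int

    def write_succeeds(self, acks: int) -> bool:
        # A write is considered done once at least w replicas acknowledge it.
        return acks >= self.w

    def read_succeeds(self, responses: int) -> bool:
        # A read returns once at least r replicas have answered.
        return responses >= self.r

    def reads_see_latest_write(self) -> bool:
        # Read and write sets must share a replica only when r + w > n.
        return self.r + self.w > self.n


slide_example = QuorumConfig(n=5, w=3, r=2)
print(slide_example.write_succeeds(acks=3))
print(slide_example.reads_see_latest_write())
```

With the slide's numbers, R + W equals N, so a read of 2 nodes can miss all 3 nodes that took the write. That is the consistency being traded for latency; raising R to 3 closes the gap at the cost of slower reads.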
