Cloud storage



  1. DATA & STORAGE in the cloud
  2. Outline • Distributed file systems • Introduction to Big Data • Storage paradigms (RDBMS, NoSQL, and NewSQL) • Writing an application on top of distributed storage (Cassandra)
  3. File system • The purpose of a file system is to: • Organize and store data • Support sharing of data among users and applications • Ensure persistence of data after a reboot • Examples include FAT, NTFS, ext3, ext4, etc.
  4. Distributed file system (DFS) • Self-explanatory: the file system is distributed across many machines • The DFS provides a common abstraction over the dispersed files • Each DFS has an associated API that provides a service to clients, consisting of normal file operations such as create, read, and write • Maintains a namespace which maps logical names to physical names • Simplifies replication and migration • Examples include the Network File System (NFS), Andrew File System (AFS), Google File System (GFS), Hadoop Distributed File System (HDFS), etc.
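The logical-to-physical name mapping described above can be sketched as a small lookup table. This is an illustrative toy, not any real DFS API; the class and all names in it are hypothetical.

```python
# Toy DFS namespace: logical paths map to physical locations
# (machine, local path). Replication is just multiple entries
# per logical name; migration only rewrites this table.

class Namespace:
    def __init__(self):
        self._table = {}  # logical name -> list of (machine, local path)

    def register(self, logical, machine, local_path):
        # A logical file may map to several physical copies (replicas).
        self._table.setdefault(logical, []).append((machine, local_path))

    def resolve(self, logical):
        # Clients keep using the same logical name even if the
        # physical copies move or are re-replicated.
        return self._table.get(logical, [])
```

Because clients only ever see logical names, replication and migration become bookkeeping on this table rather than a client-visible change.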
  5. Introduction to GFS • Designed by Google to meet its massive storage needs • Shares many goals with previous distributed file systems, such as performance, scalability, reliability, and availability • At the same time, its design was driven by key observations of Google's workload and infrastructure, both current and future
  6. Design Goals • Failure is the norm rather than the exception: GFS must constantly introspect and automatically recover from failures • The system stores a fair number of large files: Optimize for large files, on the order of GBs, but still support small files • Most applications perform large, sequential writes that are mostly append operations: Support small writes but do not optimize for them • Most operations are producer-consumer queues or many-way merging: Support concurrent reads and writes by hundreds of clients simultaneously • Applications process data in bulk at a high rate: Favor throughput over latency
  7. Files • Files are sliced into fixed-size chunks of 64 MB • Each chunk is identified by an immutable and globally unique 64-bit handle • Chunks are stored by chunkservers as local Linux files • Reads and writes to a chunk are specified by a handle and a byte range • Each chunk is replicated on multiple chunkservers, 3 by default
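The fixed-size chunking scheme can be illustrated with a little arithmetic; this is a sketch of the addressing idea, not GFS code, and the function names are hypothetical.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB fixed-size chunks, as in GFS

def num_chunks(file_size):
    # Number of chunks needed to store a file of file_size bytes
    # (ceiling division; even a 1-byte file occupies one chunk).
    return (file_size + CHUNK_SIZE - 1) // CHUNK_SIZE

def chunk_index(offset):
    # Which chunk a byte offset falls into; together with a byte
    # range within the chunk, this is how reads and writes address data.
    return offset // CHUNK_SIZE
```

A client translates (file, offset) into (chunk index, offset within chunk), asks the master for that chunk's handle and replica locations, and then talks to the chunkservers directly.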
  8. Architecture • Consists of a single master and multiple chunkservers • The system can be accessed by multiple clients • Both the master and chunkservers run as user-space server processes on commodity Linux machines
  9. Master • In charge of all file system metadata: the namespace, access control information, the mapping between files and chunks, and the current locations of chunks • Holds this information in memory and regularly syncs it with a log file • Also in charge of chunk leasing, garbage collection, and chunk migration • Periodically sends each chunkserver a heartbeat signal to check its state and send it instructions • Clients interact with it to access metadata, but all data-bearing communication goes directly to the relevant chunkservers • As a result, the master does not become a performance bottleneck
  10. Master: Consistency Model • All namespace mutations (such as file creation) are atomic, as they are exclusively handled by the master • Namespace locking guarantees atomicity and correctness • The operation log maintained by the master defines a global total order of these operations
  11. Mutation Operations • Each chunk has many replicas • The primary replica holds a lease from the master • It decides the order of all mutations for all replicas
  12. Write Operation • The client obtains the locations of the replicas and the identity of the primary replica from the master • It then pushes the data to all replica nodes • The client issues an update request to the primary • The primary forwards the write request to all replicas • It waits for a reply from all replicas before returning to the client
  13. Record Append Operation • The append location is chosen by GFS and communicated to the client • The primary forwards the write request to all replicas and waits for a reply from all of them before returning to the client • If the record fits in the current chunk, it is written and its location communicated to the client • If it does not, the chunk is padded and the client is told to try the next chunk • Performed atomically
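The fit-or-pad decision in record append can be sketched as follows; this is an illustrative model of the rule described above, not actual GFS logic, and the function name is hypothetical.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB fixed-size chunks

def plan_append(chunk_used, record_size):
    # chunk_used: bytes already written in the current chunk.
    # If the record fits, it is written at the current end of the chunk;
    # otherwise the remainder of the chunk is padded and the client
    # retries against the next chunk.
    if chunk_used + record_size <= CHUNK_SIZE:
        return ("write", chunk_used)   # append at this offset
    return ("pad_and_retry", CHUNK_SIZE - chunk_used)  # bytes of padding
```

Padding wastes a little space at chunk boundaries, but it keeps every record inside a single chunk, which is what makes the append atomic.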
  14. Chunk Placement • Chunks are put on chunkservers with below-average disk space usage • The number of “recent” creations on a chunkserver is limited, to ensure that it does not experience a traffic spike due to its fresh data • For reliability, replicas are spread across racks
  15. Stale Replica Detection • Each chunk is assigned a version number • Each time a new lease is granted, the version number is incremented • Stale replicas with outdated version numbers are simply garbage collected
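The version-number check can be written in a few lines; this is an illustrative toy with a hypothetical function name, not GFS code.

```python
def find_stale_replicas(current_version, replica_versions):
    # current_version: the chunk version recorded by the master.
    # replica_versions: {chunkserver: version that server reports}.
    # Any replica behind the master's version missed a mutation while
    # its server was down, so it is stale and becomes a garbage
    # collection candidate.
    return sorted(s for s, v in replica_versions.items()
                  if v < current_version)
```

Because the version is bumped on every new lease, a replica that was unreachable during a mutation is detected the next time it reports in, without the master having to track outages explicitly.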
  16. Garbage Collection • A lazy reclamation strategy is used: chunks are not reclaimed at delete time • Each chunkserver communicates the subset of its current chunks to the master in the heartbeat signal • The master pinpoints chunks which have been orphaned • Orphaned chunks become garbage • The chunkserver finally reclaims that space
  17. Introduction to HDFS • An open-source clone of GFS • Comes packaged with Hadoop • The master is called the NameNode and chunkservers are called DataNodes • Chunks are known as blocks • Exposes a Java API and a command-line interface
  18. Command-line API • Accessible through: bin/hdfs dfs -[command] [args] • Useful commands: cat, copyFromLocal, copyToLocal, cp, ls, mkdir, moveFromLocal, moveToLocal, mv, rm, etc.
  20. Problem • Today, government agencies at the federal, state, and local levels are confronting the same challenge that commercial organizations have been struggling with in recent years: how best to capture and utilize the increasing amount of data coming from more sources than ever before.
  21. The current framework: the Web, multidisciplinary and complex
  22. Big Data • Large datasets whose processing and storage requirements exceed all traditional paradigms and infrastructure
  23. The 3 Vs of Big Data (volume, velocity, and variety) • The “BIG” in big data isn't just about volume
  24. Big data ecosystem • Presentation layer • Application layer: frameworks + storage • Operating system layer • Virtualization layer (optional) • Network layer (intra- and inter-data center) • Physical infrastructure • Can roughly be called the “cloud”
  25. More examples of big data • Index 20 billion web pages a day and handle in excess of 3 billion search queries daily • Provide email storage to 425 million Gmail users • Serve 3 billion YouTube videos a day • 400 million tweets every day • In March 2012, the Obama Administration announced the Big Data Research and Development Initiative: $200 million in new R&D investments to explore how Big Data could be used to address important problems facing the government.
  26. Why are they collecting all this data? • Target marketing: to send you catalogs for exactly the merchandise you typically purchase; to suggest medications that precisely match your medical history; to “push” television channels to your set instead of your “pulling” them in; to send advertisements on those channels just for you • Targeted information: to know what you need before you even know you need it, based on past purchasing habits; to notify you of your expiring driver's license, credit cards, or last refill on a prescription; to give you turn-by-turn directions to a shelter in case of emergency
  27. What problems does Big Data raise?
  28. What is the problem? • Traditionally, computation has been processor-bound • For decades, the primary push was to increase the computing power of a single machine: faster processors, more RAM • Distributed systems evolved to allow developers to use multiple machines for a single job • At compute time, data is copied to the compute nodes
  29. What is the problem? • Getting the data to the processors becomes the bottleneck • A quick calculation: at a typical disk data transfer rate of 75 MB/sec, the time taken to transfer 100 GB of data to the processor is approx. 22 minutes!
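The quick calculation on this slide can be reproduced directly:

```python
def transfer_minutes(data_gb, rate_mb_per_s=75):
    # Minutes needed to move data_gb gigabytes at rate_mb_per_s
    # megabytes per second (1 GB = 1024 MB, matching the slide's figure).
    return (data_gb * 1024) / rate_mb_per_s / 60
```

For 100 GB at 75 MB/sec this gives roughly 22.8 minutes, which is the "approx. 22 minutes" quoted above, and it is the motivation for moving computation to the data rather than data to the computation.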
  30. What is the problem? • Failure of a component may cost a lot • What do we need when a job fails? • Failure may result in a graceful degradation of application performance, but the entire system should not completely fail • It should not result in the loss of any data • It should not affect the outcome of the job
  31. RDBMS, NoSQL, NewSQL & Apps
  32. Introduction • Data is everywhere and is the driving force behind our lives • The address book on your phone is data; so is the newspaper that you read every morning • Everything you see around you is a potential source of data which might be useful for a certain application • We use this data to share information and make more informed decisions about different events • Datasets can easily be classified on the basis of their structure: structured, unstructured, and semi-structured
  33. Structured Data • Formatted in a universally understandable and identifiable way • In most cases, structured data is formally specified by a schema • Your phone's address book is structured because it has a schema consisting of name, phone number, address, email address, etc. • Most traditional databases contain structured data laid out across columns and rows • Each field also has an associated type • Possible to search for items based on their data types
  34. Unstructured Data • Data without any conceptual definition or type • Can vary from raw text to binary data • Processing unstructured data requires parsing and tagging on the fly • In most cases, consists of simple log files
  35. Semi-structured Data • Occupies the space between the structured and unstructured ends of the spectrum • For instance, while binary data has no structure, audio and video files have metadata with structure, such as author, time of creation, etc. • Can also be labelled as having a self-describing structure
  36. Storage
  37. Database Management Systems (DBMS) • Used to store and manage data • Support for large amounts of data • Ensure concurrency, sharing, and locking • Security, to enable fine-grained access control • Ability to keep working in the face of failure
  38. Relational Database Management Systems (RDBMS) • The most popular and predominant storage system in use • Data in different files is connected by using a key field • Data is laid out in different tables, with a key field that identifies each row • The same key field is used to connect one table to another • For instance, one table might have customer ID as key and the customer's details as data; another table might have the same key but different data, say her purchases; yet another table with the same key might have a breakdown of her preferences • Examples include Oracle Database, MS SQL Server, MySQL, IBM DB2, and Teradata
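The customer example above can be sketched with two tables connected by a key field; the schema and data here are hypothetical, and SQLite (via Python's built-in sqlite3 module) stands in for the RDBMS.

```python
import sqlite3

# Two tables sharing a key field: customers keyed by id, and
# purchases referencing that same id. All names and rows are
# made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE purchases (customer_id INTEGER, item TEXT)")
conn.execute("INSERT INTO customers VALUES (1, 'Alice'), (2, 'Bob')")
conn.execute("INSERT INTO purchases VALUES (1, 'book'), (1, 'pen')")

def purchases_for(name):
    # The shared key field (customers.id = purchases.customer_id)
    # is what connects one table to the other.
    rows = conn.execute(
        "SELECT p.item FROM customers c "
        "JOIN purchases p ON c.id = p.customer_id "
        "WHERE c.name = ? ORDER BY p.item", (name,)).fetchall()
    return [r[0] for r in rows]
```

The join in `purchases_for` is exactly the operation that, as the later slides note, becomes hard to do efficiently at massive scale.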
  39. RDBMS and Structured Data • As structured data follows a predefined schema, it naturally maps onto a relational database system • The schema defines the type and structure of the data and its relations • Schema design is an arduous process and needs to be done before the database can be populated • Another consequence of a strict schema is that it is non-trivial to extend • For instance, adding a new attribute to an existing row necessitates adding a new column to the entire table • Extremely suboptimal in tables with millions of rows
  40. RDBMS and Semi- and Unstructured Data • Unstructured data has no notion of schema, while semi-structured data only has a weak one • Data within such datasets also has an associated type • In fact, types are application-centric: it might be possible to interpret a field as a float in one application and as a string in another • While it is possible, with human intervention, to glean structure from unstructured data, it is an extremely expensive task • Structureless data generated by real-time sources can change the number of attributes and their types on the fly • An RDBMS would require the creation of a new table each time such a change takes place • Therefore, unstructured and semi-structured data do not fit the relational model
  41. NoSQL • It's not about saying that SQL should never be used, or that SQL is dead
  42. NoSQL is simply Not Only SQL! • It's about recognizing that for some problems other storage solutions are better suited
  43. NoSQL • Database management without the relational model; schema-free • Usually not ACID • Eventually consistent data • Distributed, fault-tolerant • Large amounts of data • Low and predictable response time (latency) • Scalability & elasticity (at low cost!) • High availability • Flexible schemas / semi-structured data
  44. Some NoSQL use cases • 1. Massive data volumes: a massively distributed architecture is required to store the data (Google, Amazon, Yahoo, Facebook: 10-100K servers) • 2. Extreme query workload: impossible to efficiently do joins at that scale with an RDBMS • 3. Schema evolution: schema flexibility (migration) is not trivial at large scale; schema changes can be gradually introduced with NoSQL
  45. Three (emerging) NoSQL categories • Key-value stores: based on DHTs and Amazon's Dynamo paper; data model: a (global) collection of K-V pairs; examples: Dynomite, Voldemort, Tokyo • BigTable clones: based on Google's BigTable paper; data model: a big table with column families; examples: HBase, Hypertable
  46. Three (emerging) NoSQL categories (cont.) • Document databases: inspired by Lotus Notes; data model: collections of K-V collections; examples: CouchDB, MongoDB
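The key-value data model from the first category above is the simplest of the three; a toy version, with hypothetical method names, is little more than a wrapper over a map.

```python
# Toy key-value store: a global collection of K-V pairs with put,
# get, and delete, and nothing else (no joins, no schema). Real
# Dynamo-style stores add partitioning and replication on top of
# exactly this interface.

class KVStore:
    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def delete(self, key):
        self._data.pop(key, None)
```

The narrow interface is the point: because every operation touches a single key, the store can be sharded across machines without ever needing a cross-node join.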
  47. NoSQL pros/cons
  48. NewSQL • NewSQL is a class of modern relational database management systems that seek to provide the same scalable performance as NoSQL systems for OLTP workloads while still maintaining the ACID guarantees of a traditional single-node database system
  49. NewSQL • SQL as the primary interface • ACID support for transactions • Non-locking concurrency control • High per-node performance • Parallel, shared-nothing architecture • Radically better scalability and performance
  50. NewSQL • A hybrid of traditional RDBMS and NoSQL • The scalability and performance of NoSQL with the ACID guarantees of an RDBMS • Uses SQL as the primary language • Able to scale out and run over commodity hardware • Classified into: 1. new databases, designed from scratch; 2. new MySQL storage engines, which keep MySQL as the interface but replace the storage engine; 3. transparent clustering, which adds pluggable features to existing databases to ensure scalability
  51. NewSQL World
  52. Column Store Databases
  53. Why Column Store? • Can be significantly faster than row stores for some applications • Fetch only the required columns for a query • Better cache effects • Better compression (similar attribute values within a column) • But can be slower for other applications, e.g. OLTP with many row inserts • A long war between the column-store and row-store camps
  54. Introduction to Cassandra • Borrows concepts from both Dynamo and BigTable • Originally developed by Facebook but now an Apache open-source project • Designed for Facebook's Inbox search, for efficiently storing, indexing, and searching messages
  55. Design Goals • Processing of large amounts of data • Highly scalable • Reliability at massive scale • High-throughput writes without sacrificing read efficiency
  56. Introduction (cont.) • Developed by Facebook (for Inbox search), now an Apache project; Facebook is now developing its own version again • Based on Google BigTable (data model) and Amazon Dynamo (partitioning & consistency) • Peer-to-peer: every node is aware of all other nodes in the cluster • Design goals: high availability; eventual consistency (improves availability); incremental scalability / elasticity; optimistic replication
  57. Data model • Same as BigTable • Super columns (nested columns) and super column families • Column order in a column family can be specified (by name or time) • Cluster membership: gossip, where every node gossips to 1-3 other nodes about the state of the cluster and merges the incoming info with its own; changes in the cluster (node in/out, failure) propagate quickly (O(log N)); probabilistic failure detection (sliding window, Exp(α) or N(μ, σ²)) • Dynamic partitioning: consistent hashing; a ring of nodes; nodes can be “moved” on the ring for load balancing
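The consistent-hashing ring used for dynamic partitioning can be sketched as follows. This is a minimal one-token-per-node version with hypothetical node names; it omits the virtual nodes and replication a real Cassandra cluster layers on top.

```python
import bisect
import hashlib

# Minimal consistent-hashing ring: each node is hashed to a position
# on a circle, and a key is owned by the first node clockwise from
# the key's own hash position.

class Ring:
    def __init__(self, nodes):
        self._ring = sorted((self._hash(n), n) for n in nodes)

    @staticmethod
    def _hash(key):
        # Any uniform hash works; MD5 is used here only for spread.
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def node_for(self, key):
        # Walk clockwise from the key's position to the first node,
        # wrapping around the end of the ring.
        positions = [p for p, _ in self._ring]
        i = bisect.bisect(positions, self._hash(key)) % len(self._ring)
        return self._ring[i][1]
```

The payoff is incremental scalability: adding a node claims only the keys between it and its predecessor on the ring, so every other key keeps its current owner.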
  58. Cassandra @ Facebook • Inbox search • ca. 2009: 50 TB of data, 150 nodes, 2 datacenters • Performance (production)