DATA STORAGE. Introduction to enterprise data storage. Data storage management. File systems. Cloud file system. Cloud data stores. Using Grid for data storage.
2. Introduction to enterprise data storage
Direct-Attached Storage (DAS)
Storage Area Network (SAN)
Network-Attached Storage (NAS)
Direct-Attached Storage (DAS)
Basic storage system
Used as the building block for SAN and NAS
Performance of SAN & NAS depends on DAS
3. Storage Area Network (SAN)
Lets multiple hosts connect to a single storage device
Simultaneous access is not permitted; used in clustering environments
Technologies: Fibre Channel (FC), iSCSI, AoE (ATA over Ethernet)
Network-Attached Storage (NAS)
File level storage
File server
Advantage: a single volume can be shared by multiple hosts
4. Data storage management
Tiered Storage
Data storage management
Storage resource management tools:
Configuration tools: handle setup of storage resources
Provisioning tools: define and control access to storage resources
Measurement tools: analyze performance
Storage management process
Data storage management tools rely on policies
Three areas in storage management
Change management
Performance and capacity planning
Tiering
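As a rough illustration of policy-driven tiering, the sketch below assigns data to a tier based on how recently it was accessed. The tier names and the 30-day and 180-day thresholds are assumptions made up for this example, not values taken from any storage product.

```python
from datetime import datetime, timedelta

# Hypothetical policy: (maximum age since last access, tier name), checked in order.
TIER_POLICY = [
    (timedelta(days=30), "tier1-fast"),
    (timedelta(days=180), "tier2-capacity"),
]
ARCHIVE_TIER = "tier3-archive"  # hypothetical fallback tier for cold data


def choose_tier(last_access: datetime, now: datetime) -> str:
    """Return the tier for data last accessed at `last_access`."""
    age = now - last_access
    for max_age, tier in TIER_POLICY:
        if age <= max_age:
            return tier
    return ARCHIVE_TIER


if __name__ == "__main__":
    now = datetime(2024, 1, 1)
    for days in (5, 90, 400):
        print(days, choose_tier(now - timedelta(days=days), now))
```

Keeping the policy in a table rather than in code branches means thresholds can be changed without touching the placement logic.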
5. Data storage challenges
Massive data demand
Performance barrier
Power consumption and cost
Unified Storage
Combination of NAS and SAN: NUS (network unified storage)
Accessed by single and multiple hosts
Advantage: reduced cost; supports Fibre Channel and iSCSI
6. File Systems
FAT file system
Designed for systems with very little RAM and small disks
Requires fewer system resources
NTFS
Much more complex than FAT.
System areas customized while using files
Security incorporated
Not suited to small disks
7. Cloud file system
Considerations:
Must sustain basic file functionality
Should be open source
Should be mature enough
Should be shared
Should be scalable
Should provide genuine data protection
8. Ghost file system
Used in AWS (Amazon Web Services)
Highly redundant, elastic, mountable, cost-effective, standards-based file system
Provides a fully featured, scalable and stable cloud file system
Benefits of Ghost file system
Elastic and cost effective
Multi-region redundancy
Highly secure
No administration
Accessible from anywhere
9. Features of Ghost file system
Mature elastic file system
All files & metadata can be duplicated
WebDAV for standard mounting on any server or client
FTP access
Web interface
File name search
Side loading of files
10. Gluster File system
Is open source
Distributed file system
Clusters storage devices over the network, aggregating disk and memory resources and managing data as a single unit
Supports any client with a valid IP address
Attributes: scalability, performance, high availability, global namespace, etc.
11. Hadoop File System
Distributed file system designed to run on commodity hardware
Files are stored in blocks ranging from 64 MB to 1024 MB
Blocks are distributed across the cluster and replicated for fault tolerance
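The block-and-replica idea can be sketched in a few lines. The example below is a simplified illustration, not Hadoop's actual placement logic: the round-robin placement rule, the node names and the replication factor of 3 are assumptions chosen for the example.

```python
import math

MB = 1024 * 1024


def block_count(file_size: int, block_size: int = 64 * MB) -> int:
    """Number of blocks needed to hold a file of `file_size` bytes."""
    return math.ceil(file_size / block_size)


def place_replicas(num_blocks: int, nodes: list, replication: int = 3) -> dict:
    """Assign each block to `replication` distinct nodes, round-robin.

    This placement rule is a simplification for illustration only.
    """
    if replication > len(nodes):
        raise ValueError("not enough nodes for the requested replication")
    return {
        b: [nodes[(b + r) % len(nodes)] for r in range(replication)]
        for b in range(num_blocks)
    }


if __name__ == "__main__":
    blocks = block_count(200 * MB)  # a 200 MB file with 64 MB blocks
    for block, holders in place_replicas(blocks, ["n1", "n2", "n3", "n4"]).items():
        print(block, holders)
```

Because every block lives on several nodes, losing one node in this sketch still leaves at least two copies of each block it held.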
XtreemFS: a distributed and replicated file system
Is distributed, replicated and open source
Allows files to be mounted and accessed via the WWW
Replicates files to reduce network congestion and latency and to increase data availability
12. Kosmos File System
Gives high performance with availability and reliability
Implemented in C++ using standard system components
Integrated with Hadoop and Hypertable
Cloud FS
Is a distributed file system to solve problems
Cloud FS is based on GlusterFS, supported by Red Hat and hosted by Fedora
13. Cloud data stores
Is a data repository in which data is stored as objects
Includes data repositories, flat files.
Types
Relational databases
Object oriented databases
Operational databases
Schema less data stores
Paper files
Data files
14. Distributed data stores
Is like a distributed database
Non-relational databases that search quickly across a large number of nodes
E.g. Google's Bigtable, Amazon's Dynamo, Microsoft's Windows Azure.
Types of data stores
Bigtable: a compressed, high-performance, proprietary storage system
Developed in 2004, used in Google applications such as Google Earth, Google Maps, Google Book Search…
Advantage: scalability, better performance control
Bigtable maps two arbitrary string values (row key and column key) and a timestamp to an associated arbitrary byte array.
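The three-part key described above can be pictured as a nested map. The toy class below is only an in-memory sketch of that data model; the class name `TinyTable`, its methods and the sample keys are hypothetical and are not part of any Google API.

```python
from typing import Dict, Optional, Tuple


class TinyTable:
    """Toy in-memory map: (row key, column key, timestamp) -> bytes."""

    def __init__(self) -> None:
        # (row, column) -> {timestamp: value}
        self._cells: Dict[Tuple[str, str], Dict[int, bytes]] = {}

    def put(self, row: str, column: str, timestamp: int, value: bytes) -> None:
        """Store one version of a cell."""
        self._cells.setdefault((row, column), {})[timestamp] = value

    def get(self, row: str, column: str,
            timestamp: Optional[int] = None) -> Optional[bytes]:
        """Return the value at `timestamp`, or the newest version if omitted."""
        versions = self._cells.get((row, column))
        if not versions:
            return None
        if timestamp is None:
            timestamp = max(versions)
        return versions.get(timestamp)


if __name__ == "__main__":
    table = TinyTable()
    table.put("com.example", "contents:html", 1, b"<html>v1</html>")
    table.put("com.example", "contents:html", 2, b"<html>v2</html>")
    print(table.get("com.example", "contents:html"))     # newest version
    print(table.get("com.example", "contents:html", 1))  # specific version
```

Keeping the timestamp as part of the key is what lets several versions of the same cell coexist.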
15. Other similar S/W:
Apache Accumulo, Apache Cassandra, HBase, KDI
Dynamo: a distributed storage system
Is a proprietary key-value structured storage system, or distributed data store
Has properties of both databases and distributed hash tables
Is a distributed storage system, not a relational database
Advantage: responsive, consistent
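One common way for a distributed hash table to spread keys over nodes is consistent hashing. The sketch below shows a minimal hash ring under stated assumptions: the class `HashRing`, the node names and the use of SHA-256 are choices made for illustration and do not describe Amazon's implementation.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a position on the ring (SHA-256 is an arbitrary choice)."""
    return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Minimal consistent-hash ring with one point per node."""

    def __init__(self, nodes):
        self._ring = sorted((_hash(n), n) for n in nodes)
        self._points = [h for h, _ in self._ring]

    def node_for(self, key: str) -> str:
        """Return the first node clockwise from the key's position."""
        if not self._ring:
            raise ValueError("ring has no nodes")
        i = bisect.bisect_right(self._points, _hash(key)) % len(self._ring)
        return self._ring[i][1]

    def preference_list(self, key: str, n: int) -> list:
        """Return up to n distinct nodes that follow the key on the ring."""
        start = bisect.bisect_right(self._points, _hash(key))
        count = min(n, len(self._ring))
        return [self._ring[(start + k) % len(self._ring)][1] for k in range(count)]


if __name__ == "__main__":
    ring = HashRing(["node-a", "node-b", "node-c"])  # hypothetical node names
    print(ring.node_for("cart:42"))
    print(ring.preference_list("cart:42", 2))
```

With this layout, adding or removing a node only moves the keys adjacent to it on the ring, rather than reshuffling every key.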
16. Using Grid for data storage
Grid storage for grid computing
It virtualizes heterogeneous and remotely located components into a single
system
It allows sharing of computing & data resources for multiple work loads &
enable collaboration
Grid computing uses NAS type of storage
To meet its unique demands, storage for the grid must be flexible
Grid-Oriented Storage (GOS)
Is a dedicated data storage architecture connected directly to a computational grid
Acts as data bank
Successor of NAS
Deals with long distance, heterogeneous & single image file operations
Acts as a file server