1. INTELLIGENT DATA OUTSOURCING
Improved Storage Availability in Large-Scale Data Centers
3. INTRODUCTION
Big data
● Broad term for large datasets
● Business technology for modern enterprises
● Accuracy leads to reduced risks
● Storage is a critical component
● Stored on disks
4. STORAGE SYSTEM TASKS
● High-priority foreground tasks
● Low-priority background tasks
[Diagram: storage system tasks divided into foreground tasks and background tasks]
6. INEFFICIENCIES OF EXISTING SYSTEM
● Time consuming
● Data loss
● Inefficient storage availability
● Failure-induced optimization
● Does not exploit the predictable nature
● Passive
7. INTELLIGENT DATA OUTSOURCING
● Dynamically captures data popularity
● Exploits temporal and spatial access locality
● Balances background task workflow and user I/O requests
● Portable
8. OPTIMIZATION SCHEME
EXISTING SYSTEM
● Reactive Optimization
● Request Based
● Exploits Temporal Locality
PROPOSED SYSTEM
● Proactive Optimization
● Zone Based
● Exploits both temporal and spatial locality
10. ACCESS LOCALITY
Temporal locality
● Repeated data access within a short time span
● Request-based optimization
Spatial locality
● Clustered data access within small storage areas
● Zone-based optimization
[Diagram: access locality divided into temporal locality and spatial locality]
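As a rough illustration of the two kinds of locality, the sketch below counts both on a list of requested block addresses. The function names, the window of 4 requests, and the zone size of 100 blocks are hypothetical choices for this example and do not come from the slides.

```python
from collections import deque


def temporal_hits(trace, window=4):
    """Count requests whose address was also seen in the previous `window` requests."""
    recent = deque(maxlen=window)
    hits = 0
    for addr in trace:
        if addr in recent:
            hits += 1
        recent.append(addr)
    return hits


def spatial_hits(trace, zone_size=100):
    """Count requests that land in the same zone as the request just before them."""
    hits = 0
    prev_zone = None
    for addr in trace:
        zone = addr // zone_size
        if zone == prev_zone:
            hits += 1
        prev_zone = zone
    return hits


# Hypothetical trace of block addresses.
trace = [10, 10, 15, 500, 10, 20]
print(temporal_hits(trace), spatial_hits(trace))
```

A request-based scheme can only benefit from the first count, while a zone-based scheme also benefits from the second, because nearby blocks share a zone.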
12. DESIGN OF IDO
THREE MAIN OBJECTIVES
● Improving storage availability
● Improving I/O performance
● Providing high portability
14. IDO FUNCTIONAL MODULES
[Diagram: Hot Zone Identifier, Task Predictor, Request Distributor, Data Migrator, Data Reclaimer]
● Hot Zone Identification
● Task Prediction
● Request Distribution
● Data Migration
● Data Reclamation
15. KEY DATA STRUCTURES
ZONE_TABLE
● Num
● Popularity
● Flag
D_MAP
● D_offset
● S_offset
● Len
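The two structures listed above could be sketched as follows. The field names come from the slide; the types, the meaning given to each field in the comments, the `00` initial flag value, and the helper methods `covers` and `translate` are assumptions made for illustration only.

```python
from dataclasses import dataclass
from enum import Enum


class ZoneFlag(Enum):
    # The values 01 and 10 appear on the migration slide;
    # 00 as the initial state is an assumption.
    NORMAL = 0b00
    MIGRATED = 0b01
    RECONSTRUCTED = 0b10


@dataclass
class ZoneEntry:
    """One row of ZONE_TABLE."""
    num: int                          # zone number
    popularity: int = 0               # assumed to be an access count
    flag: ZoneFlag = ZoneFlag.NORMAL


@dataclass
class DMapEntry:
    """One row of D_MAP, describing where a migrated extent lives."""
    d_offset: int   # assumed: start offset on the degraded RAID set
    s_offset: int   # assumed: start offset on the surrogate RAID set
    length: int     # "Len" on the slide

    def covers(self, offset: int) -> bool:
        """True if `offset` on the degraded set falls inside this extent."""
        return self.d_offset <= offset < self.d_offset + self.length

    def translate(self, offset: int) -> int:
        """Map an offset on the degraded set to the surrogate set."""
        return self.s_offset + (offset - self.d_offset)
```

Keeping the mapping as (offset, offset, length) extents rather than per-block entries keeps D_MAP small when whole zones are migrated contiguously.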
16. HOT DATA IDENTIFICATION
THREE DESIGN ISSUES
● By exploiting the spatial locality of workloads
● By exploiting the temporal locality of requests
● By implementing intelligent modules and data structures
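A minimal sketch of zone-based hot data identification, assuming popularity is a plain per-zone access count: grouping offsets into zones captures spatial locality, and repeated accesses raising the count captures temporal locality. `ZONE_SIZE`, `zone_of`, `hot_zones`, and the top-k selection rule are hypothetical and are not taken from the slides.

```python
from collections import Counter

ZONE_SIZE = 1024  # blocks per zone; hypothetical value


def zone_of(offset, zone_size=ZONE_SIZE):
    """Map a block offset to the number of the zone that contains it."""
    return offset // zone_size


def hot_zones(offsets, top_k=2, zone_size=ZONE_SIZE):
    """Return the `top_k` most frequently accessed zones, hottest first."""
    popularity = Counter(zone_of(o, zone_size) for o in offsets)
    return [zone for zone, _count in popularity.most_common(top_k)]


# Hypothetical request offsets: three hits in zone 0, two in zone 4, one in zone 8.
print(hot_zones([10, 20, 30, 5000, 5001, 9000]))
```

A real identifier would likely age the counts over time so that stale zones cool down; that refinement is left out here.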
17. PROACTIVE DATA MIGRATION
● Hot zone identified
● Task Predictor detects task
● Data Migrated
● Flag set to 01
● RAID Reconstructed
● Flag set to 10
● RAID Reclaimed
● Corresponding D_map deleted
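The sequence above can be read as a small state machine over the zone flag. The sketch below is one possible reading, not the authors' implementation: the dictionaries standing in for ZONE_TABLE and D_map, the function names, and the reset of the flag to `00` after reclamation are all assumptions, and the actual data copying is omitted.

```python
def migrate(zone_table, d_map, zone, extent):
    """Record that a hot zone was copied to the surrogate set (copy itself omitted)."""
    d_map[zone] = extent        # (d_offset, s_offset, length)
    zone_table[zone] = "01"     # flag value from the slide


def mark_reconstructed(zone_table, zone):
    """Record that RAID reconstruction has finished for this zone."""
    zone_table[zone] = "10"     # flag value from the slide


def reclaim(zone_table, d_map, zone):
    """Copy data back (omitted) and delete the corresponding D_map entry."""
    if zone_table[zone] != "10":
        raise RuntimeError("zone has not been reconstructed yet")
    del d_map[zone]
    zone_table[zone] = "00"     # assumption: flag returns to its initial value


# Hypothetical walk-through for zone 7.
zone_table, d_map = {7: "00"}, {}
migrate(zone_table, d_map, 7, (7168, 0, 1024))
mark_reconstructed(zone_table, 7)
reclaim(zone_table, d_map, 7)
```

Guarding `reclaim` on the `10` flag prevents data from being written back while reconstruction is still in progress.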
18. IMPROVED STORAGE AVAILABILITY FOR I/O
I/O read request
● IDO determines target data zone
● Read request issued to degraded/surrogate RAID set
● Popularity updated
● IDO checks D_map
I/O write request
● Checks D_map for write request hits
● D_map updated
● Sequentially written to surrogate RAID set
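The read and write paths could be sketched as below. This is a simplified illustration under stated assumptions: D_map is collapsed to a per-block dictionary (so `Len` is always one block), the class and attribute names are hypothetical, and the read path consults D_map before choosing a RAID set, which is an assumed ordering of the listed steps.

```python
class RequestDistributor:
    """Toy request router between a degraded and a surrogate RAID set."""

    def __init__(self, zone_blocks=1024):
        self.zone_blocks = zone_blocks  # blocks per zone; hypothetical value
        self.d_map = {}                 # block on degraded set -> block on surrogate set
        self.popularity = {}            # zone number -> access count
        self.next_free = 0              # next sequential slot on the surrogate set

    def read(self, block):
        """Update zone popularity, then route to whichever set holds the block."""
        zone = block // self.zone_blocks
        self.popularity[zone] = self.popularity.get(zone, 0) + 1
        if block in self.d_map:
            return ("surrogate", self.d_map[block])
        return ("degraded", block)

    def write(self, block):
        """Append the write sequentially on the surrogate set and update D_map."""
        s_block = self.next_free
        self.next_free += 1
        # On a hit the old mapping is simply repointed to the new location.
        self.d_map[block] = s_block
        return ("surrogate", s_block)


# Hypothetical usage: a write redirects later reads of the same block.
r = RequestDistributor()
print(r.read(5), r.write(5), r.read(5))
```

Appending writes sequentially keeps the surrogate set's I/O pattern cheap, at the cost of leaving stale copies that a reclaimer would need to clean up.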
21. CONCLUSION
● Proactive optimisation accelerates low-priority background tasks
● Zone-based approach boosts the performance of low-priority background tasks
● Designed and implemented a proactive zone-based optimisation to outsource data
22. REFERENCES
● S. Wu, H. Jiang, D. Feng, L. Tian, and B. Mao. Proactive Data Migration
for Improved Storage Availability in Large-Scale Data Centers. IEEE
Transactions on Computers, 2015.
● S. Wu, H. Jiang, D. Feng, L. Tian, and B. Mao. Improving Availability of
RAID-Structured Storage Systems by Workload Outsourcing. IEEE
Transactions on Computers, 2011.
● S. Wu, B. Mao, D. Feng, and J. Chen. Availability-Aware Cache
Management with Improved RAID Reconstruction Performance. In
CSE’10, Dec. 2010.
● L. Xiang, Y. Xu, John C. S. Lui, and Q. Chang. Optimal Recovery of
Single Disk Failure in RDP Code Storage Systems. In SIGMETRICS’10,
Jun. 2010.