Coefficient of Thermal Expansion and their Importance.pptx
Liquid File System For Efficient VM Storage
1. Liquid- A Scalable Deduplication File System For
Virtual Machine Images
Anamika G V(12143630)
S7 CSA
College of Engineering, Cherthala
August 8, 2016
Guided By
Josna Jose
Assistant Professor
Computer Science & Engineering
2. CONTENTS
1 INTRODUCTION
2 VIRTUAL MACHINE
3 DEDUPLICATION
4 EXISTING SYSTEM
5 ISSUES IN VM STORAGE
6 LIQUID SYSTEM ARCHITECTURE
7 DEDUPLICATION IN LIQUID
8 OPTIMIZATIONS ON FINGER PRINT CALCULATION
9 FILE SYSTEM LAYOUT
10 COMMUNICATION AMONG COMPONENTS HEART BEAT
PROTOCOL
11 FAST CLONING FOR VM IMAGES
12 FAULT TOLERANCE
13 GARBAGE COLLECTION
14 ADVANTAGES OF LIQUID
15 CONCLUSION
3. INTRODUCTION
Cloud computing is the practice of using a remote servers hosted
on the internet to store manage and process data rather than a
local server or a personnel computer.
Figure : 1.A sample cloud computing network
4. VIRTUALIZATION and VIRTUAL
MACHINE
1 Virtualization deals with extending or replacing an existing
interface so as to mimic the behavior of another system
2 A Virtual Machine is a software that creates a virtualized
environment between the computer platform and the end user
in which the end user can operate software.
3 Crucial component in cloud computing.
4 Virtual Machine - Hypothetical Computer.
5 Executes programs like a physical machine.
6 Initial state of a virtual machine is stored in a file called
virtual Machine image.
6. DEDUPLICATION
1 Data Deduplication data compression technology.
2 Eliminates duplicate copies of repeating data.
3 A redundant data blocks are replaced so as to avoid the
problems regarding storage consumption of a large number of
VM images.
4 Improves storage utilization.
Figure : 3.Deduplicated file system
7. EXISTING SYSTEM
1 Hypervisors such as Xen, KVM etc.
2 Network Attached Storage (NAS)
3 Storage Area Network (SAN)
4 Direct Attached Storage (DAS)
8. ISSUES IN VM STORAGE
1 High demand on VM storage remains a challenging problem.
2 Existing systems have made efforts to reduce storage
consumption.
3 Uses SAN cluster.
4 Cannot satisfy increasing demand due to cost limitation.
5 Hence we propose LIQUID.
9. LIQUID SYSTEM ARCHITECTURE
1 Three components - Single meta server with hot back up
multiple data server and multiple clients.
2 Runs on user-level service process.
3 VM images are split into fixed size data blocks.
4 Meta server namespace , finger print , reference count.
5 Meta server mirrored to hot back up shadow meta server.
6 Data servers change of managing data blocks in VM images.
7 Organized in a distributed hash table.
8 A liquid client provides a POSIX compatible file system.
9 Client critical component (provides deduplication)
10 Fault tolerance Mirroring the meta server.
11 Replicas of data blocks are stored.
11. DEDUPLICATION IN LIQUID
1 Liquid chooses fixed size chunking instead of variable size
chunking.
2 Better since all files stored in VM images will be aligned on
disk block boundaries.
3 Advantage-simplicity.
4 Block size choice.
5 Block size- balancing factor which is hard to choose. Great
impact on both deduplication and io performance.
12. DEDUPLICATION IN LIQUID(CONT.)
1 Smaller block size-more random seeks when accessing a VM
image.
2 Not tolerable.
3 A large block size is also not preferable, it will reduce
deduplication ratio.
4 Liquid choose different block size under different situation.
5 Advised to use a multiplication of 4 kb between 256 kb and 1
MB to achieve good balance between IO performance and
deduplication ratio.
14. OPTIMIZATIONS ON FINGERPRINT
CALCULATION
∗ Rely on comparison of data block finger prints for redundancy.
∗ Finger print-collision resistant hash value calculated from data
block contents.
∗ MD5[26] and SHA-1[12] are frequently used for this purpose.
∗ Finger print collision - very small, orders of magnitude smaller
than hardware error rates.
∗ So we could safely assume that two data blocks are identical.
∗ Finger print calculation - expensive. ∗ Delays finger print
calculation for recently modified data blocks.
∗ Runs deduplication lazily only when it is necessary.
∗ Client side maintains a shared cache which contains recently
accessed data blocks.
15. OPTIMIZATIONS ON FINGERPRINT
CALCULATION (CONT.)
∗ A portion of memory is used by the client side of liquid as private
cache.
∗ Private cache hold-modified data blocks and delay finger print
calculation on them.
∗ Modified data block ejected from shared cache and added to
private cache.
∗ Modified data will be ejected if private cache becomes full.
∗ And ejected based on LRU policy.
∗ Only then will the modified data block’s finger print be
calculated.
∗ Liquid uses multiple threads for finger print calculation.
∗ Multiple threads will process different data blocks currently.
∗ Provides good IO performance.
16. FILE SYSTEM LAY OUT
1 All file system meta data are stored on the meta server.
2 Organized in a file system tree.
3 Client side could cache portions of file system meta data for
fast accesses.
4 When a VM is stopped ,modified meta data and data blocks.
5 Will be pushed back to meta server.
6 Data servers ensures modification on VM image is visible to
other client nodes.
18. COMMUNICATION AMONG COMPONENTS:
HEART BEAT PROTOCOL
1 META SERVER-manages all data servers.
2 Exchange regular heart beat message with each data server in
a ROUND ROBIN FASHION.
3 Detect failed data servers when there are many data servers.
4 To speed up failure detection data servers send an error signal
to meta server.
19. FAST CLONING FOR VM IMAGES
1 Copying large images may be time consuming.
2 Liquid provide efficient solution by means of fast cloning.
3 VM images represented by meta data files having reference to
data blocks.
4 By copying meta data file and updating reference count a
clone VM image is achieved.
5 Modification on cloned images will not effect the original
image.
21. GARBAGE COLLECTION
1 Removes unused garbage data blocks when running out of
space.
2 Reference counting of all data blocks are maintained by mete
servers.
3 Garbage collection request is issued periodically to data server.
4 Garbage collection is executed based on the data block
membership in the Bloom filter.
22. ADVANTAGES OF LIQUID
1 Fast Virtual Machine deployment with peer to peer data
transfer.
2 Low storage consumption by means of deduplication.
3 Instant cloning for virtual machine images.
4 On demand fetching through a network caching with local
disks.
5 LIQUID files has no specific limit.
23. CONCLUSION
1 Presented LIQUID which is a deduplication file system with
good IO performance.
2 Achieved by caching frequently accessed data blocks in
memory cache.
3 Avoids additional disk operations.
4 Deduplication of VM images proved to be effective.
24. REFERENCES
[1].Xun Zhao, Yang Zhang, Yongwei Wu, Kang Chen, Jinlei Jiang,
and Keqin Li, Senior Member, IEEE, Liquid: A Scalable
Deduplication File System for Virtual Machine ImagesIEEE
TRANSACTIONS ON PARALLEL AND DISTRIBUTED
SYSTEMS, VOL. 25, NO. 5, MAY 2014.
[2]AmazonMachineImage,Sept.2001.[Online]. Available
http://en.wikipedia.org/wiki/Amazon/Machine/Image.
[3]BittorrentProtocol,Sept.2011.[Online]. Available
http://en.wikipedia.org/wiki/BitTorrent/protocol.
[4]BloomFilter,Sept.2011.[Online]. Available
http://en.wikipedia.org/wiki/Bloom/filter
[5]Xfs:AHigh Performance Journaling Filesystem,
Sept.2011.[Online].http://oss.sgi.com/projects/xfs
26. [11]M. McLoughlin,The qcow2 Image Format, Sept. 2011.
[Online]. Available:
http://people.gnome.org/markmc/qcow-image-format.html.
[12]C. Tang, Fvd: A High-Performance Virtual Machine Image
Format for Cloud, in Proc. USENIX Conf. USENIX Annu. Tech.
Conf., 2011, p. 18.
[13]B. Zhu, K. Li, and H. Patterson,Avoiding the Disk Bottleneck
in the Data Domain Deduplication File System, in Proc. 6th
USENIX Conf. FAST, Berkeley, CA, USA, 2008, pp. 269-282.