    Getting Started with GlusterFS: Presentation Transcript

    • If a customer has found that the performance levels are acceptable, but wants to increase capacity by …, they could add another 4, 1 TB drives to each server, and will not generally experience performance degradation …
    • … than 12). Note that in this case, they are adding 2 more low-price servers, and can sim… drives. (See Config. C, above.)
      If they want to both quadruple performance and quadruple capacity, they could distribute … (each server would have 12, 1 TB drives). (See Config. D, below.)
      Note that by the time a solution has approximately 10 drives, the performance bottleneck … moved to the network. (See Config. D, above.)
    • … Ethernet network. Note that performance in this example is more than 25x that which we sa… is evidenced by an increase in performance from 200 MB/s in the baseline configuration … (See Config. E, below.)
      As you will note, the power of the scale-out model is that both capacity and performance … meet requirements. It is not necessary to know what performance levels will be needed 2, … configurations can be easily adjusted as the need demands.
    • ➜ BETA_LINK="http://download.gluster.com/pub/gluster/glusterfs/qa-releases/3.3-beta-2/glusterfs-3.3beta2.tar.gz"
      ➜ wget "$BETA_LINK"
      ➜ tar zxvf glusterfs-3.3beta2.tar.gz
      ➜ cd glusterfs-3.3beta2
      ➜ ./configure && make
      ➜ sudo make install
    • # Start Gluster management daemon for each server
      ➜ sudo /etc/init.d/glusterd start
      # Adding Servers to Trusted Storage Pool
      ➜ for HOST in host1 host2 host3; do gluster peer probe $HOST; done
      #=> Probe successful
          Probe successful
          Probe successful
    • ➜ sudo gluster peer status
      #=> Number of Peers: 3
          Hostname: host1
          Uuid: 81982001-ba0d-455a-bae8-cb93679dbddd
          State: Peer in Cluster (Connected)
          Hostname: host2
          Uuid: 03945cd4-7487-4b2c-9384-f006a76dfee5
          State: Peer in Cluster (Connected)
          ...
    • # Create a distribute Volume named 'log'
      ➜ SERVER_LOG_PATH="/mnt/glusterfs/server/log"
      ➜ sudo gluster volume create log transport tcp host1:$SERVER_LOG_PATH host2:$SERVER_LOG_PATH host3:$SERVER_LOG_PATH
    • $ sudo gluster volume start log # start Volume
      $ sudo gluster volume info log
      #=> Volume Name: log
          Type: Distribute
          Status: Started
          Number of Bricks: 12
          Transport-type: tcp
          Bricks:
          Brick1: delta1:/mnt/glusterfs/server/log
          Brick2: delta2:/mnt/glusterfs/server/log
          Brick3: delta3:/mnt/glusterfs/server/log
    • # Create a distribute replicate Volume named 'repository'
      ➜ SERVER_REPO_PATH="/mnt/glusterfs/server/repository"
      ➜ sudo gluster volume create repository replica 2 transport tcp host1:$SERVER_REPO_PATH host2:$SERVER_REPO_PATH host3:$SERVER_REPO_PATH host4:$SERVER_REPO_PATH
    • $ sudo gluster volume info repository
      #=> Volume Name: repository
          Type: Distributed-Replicate
          Status: Started
          Number of Bricks: 2 x 2 = 4
          Transport-type: tcp
          Bricks:
          Brick1: delta1:/mnt/glusterfs/server/repository
          Brick2: delta2:/mnt/glusterfs/server/repository
          Brick3: delta3:/mnt/glusterfs/server/repository
          Brick4: delta4:/mnt/glusterfs/server/repository
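With `replica 2`, GlusterFS forms replica sets from bricks in the order they are listed on the `volume create` command line, so each pair of adjacent bricks mirrors the same data ("2 x 2 = 4" above). The grouping can be sketched in plain shell as follows. The function name `replica_sets` and the `replicate-N` labels are hypothetical, and the snippet does not call `gluster` at all.

```shell
# Hypothetical helper: group a brick list into replica sets of size $1,
# taking bricks in the order given (adjacent bricks form one set).
# Bricks left over when the count is not a multiple of $1 are ignored.
replica_sets() {
  local replica=$1; shift
  local i=0 set_no=0 line="" brick
  for brick in "$@"; do
    line="$line $brick"
    i=$((i + 1))
    if [ $((i % replica)) -eq 0 ]; then
      echo "replicate-$set_no:$line"
      set_no=$((set_no + 1))
      line=""
    fi
  done
}

replica_sets 2 host1 host2 host3 host4
```

Under this ordering rule, listing two bricks from the same server next to each other would put both copies of a file on one machine, so the order of bricks on the command line matters.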
    • # Mount 'log' Volume
      ➜ CLIENT_LOG_PATH="/mnt/glusterfs/client/log"
      ➜ sudo mount -t glusterfs -o log-level=WARNING,log-file=/var/log/gluster.log localhost:log $CLIENT_LOG_PATH # native-client
      ➜ sudo mount -t nfs -o mountproto=tcp localhost:log $CLIENT_LOG_PATH # nfs
    • ➜ df -h
      #=>
      Filesystem            Size  Used Avail Use% Mounted on
      /dev/sdb              1.9T  491G  1.4T  27% /mnt/disk2
      ...
      localhost:repository   11T  4.0G   11T   1% /mnt/glusterfs/client/repository
      localhost:log         8.3T  3.6T  4.3T  46% /mnt/glusterfs/client/log
    • # Get Physical Location
      ➜ sudo getfattr -m . -n trusted.glusterfs.pathinfo /mnt/glusterfs/client/repository/some_file
      #=> # file: /mnt/glusterfs/client/repository/some_file
          trusted.glusterfs.pathinfo="(<REPLICATE:repository-replicate-0> <POSIX:host1:/mnt/glusterfs/server/repository/some_file> <POSIX:host2:/mnt/glusterfs/server/repository/some_file>)"
    • Multi-site cascading Geo-replication
      Geo-replication over LAN: You can configure GlusterFS Geo-replication to mirror data over a Local Area Network.
      Geo-replication over WAN: You can configure GlusterFS Geo-replication to replicate data over a Wide Area Network.
      Geo-replication over Internet: You can configure GlusterFS Geo-replication to mirror data over the Internet.
      Source: Gluster File System Administration Guide_3.2_02_B, Pg No. 47
    • Figure 4: Centralized Metadata Approach
    • Figure 5, below, illustrates a typical distributed metadata server implementation. It can be seen that this approach also results in considerable overhead processing for file access, and by design has built-in exposure … corruption scenarios. Here again we see a legacy approach to scale-out storage not congruent with … requirement of the modern data center or with the burgeoning migration to virtualization and cloud computing.
      Figure 5: Decentralized Metadata Approach
    • … any office that stores physical documents in folders in filing cabinets, that person should be able to f…
      Similarly, one could implement an algorithmic approach to data storage that used a similar … locate files. For example, in a ten system cluster, one … disk 10, etc. Figure 6, below, illustrates this concept.
      Figure 6: Understanding EHA: Algorithm
    • … and run it through the hashing algorithm. Each pathname/filename results in a unique numerical r… For the sake of simplicity, one could imagine assigning all files whose hash ends in the number 1 …, all which end in the number 2 to the second disk, etc. Figure 7, below, illustrates this concept.
      Figure 7: Understanding EHA: Hashing
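The last-digit idea described on this slide can be sketched in a few lines of shell. This is only an illustration under stated assumptions: `cksum` (the POSIX CRC utility) stands in for the hashing algorithm and is not the hash GlusterFS itself uses, the function name `disk_for` and the sample paths are hypothetical, and mapping a final digit of 0 to disk 10 is an assumption made to fit a ten-disk layout.

```shell
# Hypothetical helper: pick one of 10 disks from the last digit of a hash.
# cksum (POSIX CRC) is only a stand-in here, not the hash GlusterFS uses.
disk_for() {
  local hash digit
  hash=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
  digit=$((hash % 10))
  # A hash ending in 1 goes to disk 1, in 2 to disk 2, ...; 0 maps to disk 10.
  if [ "$digit" -eq 0 ]; then digit=10; fi
  echo "$digit"
}

for f in /logs/app.log /logs/db.log /repo/README; do
  echo "$f -> disk $(disk_for "$f")"
done
```

The same path always yields the same disk number, so any client can locate a file by computing the hash, with no metadata server lookup.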
    • 1. Setting up a very large number of virtual volumes
      2. Using the hashing algorithm to assign files to virtual volumes
      3. Using a separate process to assign virtual volumes to multiple physical devices
      Thus, when disks or nodes are added or deleted, the algorithm itself does not need to be changed. However, virtual volumes can be migrated or assigned to new physical locations as the need arises. Figure 8, below, illustrates the Glus…
      Figure 8: Understanding EHA: Elasticity
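The three steps above can be sketched as two independent functions: one that hashes a path to a virtual volume, and one that maps a virtual volume to a physical device. Everything named here is an assumption for illustration: `NUM_VVOLS`, `vvol_for`, `device_for_vvol`, the use of `cksum` as a stand-in hash, and the even range split over devices (numbered from 0). None of it is GlusterFS's actual implementation.

```shell
NUM_VVOLS=1000   # hypothetical number of virtual volumes, fixed up front

# Step 2: hash a path to a virtual volume (cksum is a stand-in hash).
vvol_for() {
  local hash
  hash=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
  echo $((hash % NUM_VVOLS))
}

# Step 3: a separate, replaceable mapping from virtual volume ($1) to one
# of $2 physical devices: contiguous ranges of virtual volumes, split evenly.
device_for_vvol() {
  echo $(( $1 * $2 / NUM_VVOLS ))
}

f=/repo/some_file
v=$(vvol_for "$f")
echo "$f -> vvol $v -> device $(device_for_vvol "$v" 4) (4 devices)"
echo "$f -> vvol $v -> device $(device_for_vvol "$v" 5) (5 devices)"
```

Adding a fifth device changes only the second mapping. The file's virtual volume number stays the same, which is the property the slide calls elasticity.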
    • ➜ ll /mnt/glusterfs/server/result*
      ---------T 1 doryokujin       0 Sep 13 16:40 result.host1
      -rw-r--r-- 1 doryokujin 4044654 Sep 13 16:40 result.host6