Gce Data Grid

Published in: Technology

  1. GCE - DataGrid enabled via SRB: Architecture Design and Description. Ceasar Chen-Kai Sun, Email: [email_address], NCHC. Date: 10/4/06
  2. Outline <ul><li>Goal </li></ul><ul><ul><li>Enable the GCE environment to use SRB resources and services </li></ul></ul><ul><li>Features </li></ul><ul><ul><li>Service stack: multiple service hosts </li></ul></ul><ul><ul><ul><li>Reliability </li></ul></ul></ul><ul><ul><li>User-oriented single resource (logical resource) </li></ul></ul><ul><ul><ul><li>Scalability </li></ul></ul></ul><ul><ul><li>Nearest-resource data access and multiple data replicas </li></ul></ul><ul><ul><ul><li>Performance, reliability </li></ul></ul></ul><ul><ul><li>MCAT-SRB redundancy (multi-MES) and MCAT DB backup </li></ul></ul><ul><ul><li>GSI-enabled </li></ul></ul>
  3. Outline (cont'd) <ul><li>Implementation </li></ul><ul><ul><li>Infrastructure </li></ul></ul><ul><ul><ul><li>MCAT-SRB, nonMCAT-SRB </li></ul></ul></ul><ul><ul><ul><li>MCAT DB backup </li></ul></ul></ul><ul><ul><ul><li>GSI </li></ul></ul></ul><ul><ul><li>Resource </li></ul></ul><ul><ul><ul><li>Physical resource </li></ul></ul></ul><ul><ul><ul><li>Logical resource </li></ul></ul></ul><ul><ul><ul><li>Auto multi-replica for data </li></ul></ul></ul><ul><li>How to </li></ul><ul><ul><li>Add an SRB/resource node </li></ul></ul><ul><ul><li>Use SRB resources within GCE computing </li></ul></ul>
  4. SRB service layer <ul><li>Multiple servers provide the service </li></ul>[Figure: infrastructure of the SRB service. A master MCAT server and slave MCAT servers use mcat-DB, which has a DB backup server; several nonMCAT servers sit alongside them; computing nodes connect to the SRB service.]
  5. SRB service layer (cont'd) <ul><li>SRB service </li></ul><ul><ul><li>Use multiple servers to provide the same service </li></ul></ul><ul><ul><li>Avoid a single point of failure </li></ul></ul><ul><ul><li>Both at MES and non-MES </li></ul></ul><ul><li>Back up the MCAT DB periodically </li></ul><ul><ul><li>There is no automatic takeover mechanism yet </li></ul></ul><ul><li>GSI-enabled </li></ul><ul><li>Use master/slave MES to provide the MCAT service </li></ul><ul><ul><li>Load balancing </li></ul></ul><ul><ul><li>Failover </li></ul></ul><ul><ul><li>http://www.sdsc.edu/srb/index.php/Master_Slave_MCAT </li></ul></ul><ul><ul><li>Some features have not been implemented yet </li></ul></ul>Infrastructure
  6. Data Resource layer [Figure: SRB hosts A through D and a computing node; physical resources A.input, A.arch, B.input, B.arch, C.arch; logical resources gce-arch and gce-upload; data sync between resources.] Resource
  7. Data Resource layer (cont'd) <ul><li>Logical resource </li></ul><ul><ul><li>Classify resources by type and group each type into one logical resource </li></ul></ul><ul><ul><li>Load balancing </li></ul></ul><ul><ul><li>Clients do not need to know where the data is physically stored </li></ul></ul><ul><ul><ul><li>They always use the logical resource </li></ul></ul></ul><ul><ul><li>Scalability </li></ul></ul><ul><ul><ul><li>Easy to add or remove a physical resource for a specific function (e.g., arch, data-upload, ...) </li></ul></ul></ul><ul><li>Data access issues </li></ul><ul><ul><li>Data synchronization between resources </li></ul></ul><ul><ul><ul><li>Use the multi-replica mechanism to synchronize data between all logical resources </li></ul></ul></ul><ul><ul><li>GET/PUT priority </li></ul></ul>Resource
  8. Data Resource layer (cont'd) <ul><li>Access issue: </li></ul><ul><ul><li>The "GET" priority between resources </li></ul></ul><ul><ul><ul><li>Always pick a copy stored in the cache/temporary class first and the archival/permanent class last </li></ul></ul></ul><ul><ul><ul><li>Within the same class, first pick the copy stored on the host the client is connected to </li></ul></ul></ul><ul><ul><ul><li>If i) and ii) are the same, then pick one randomly </li></ul></ul></ul><ul><ul><li>The "PUT" priority between resources </li></ul></ul><ul><ul><ul><li>A resource name must be specified (via -S or Senv) </li></ul></ul></ul><ul><ul><ul><li>For a logical resource, a physical resource on the SRB server the client is connected to is preferred </li></ul></ul></ul><ul><ul><ul><li>If there is none, one is chosen at random </li></ul></ul></ul><ul><ul><li>http://www.sdsc.edu/srb/index.php/Advanced_Scommands </li></ul></ul>Resource
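The GET priority on this slide can be read as a three-step filter. The following is only a minimal Python sketch of that reading, not SRB's actual implementation: the `Replica` type, the `pick_replica` function, the numeric ranks, and the storage classes assigned to the sample resources are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass

# Assumed ranking: cache/temporary first, archival/permanent last,
# anything else in between. The numbers are illustrative only.
CLASS_RANK = {"cache": 0, "temporary": 0, "archival": 2, "permanent": 2}


@dataclass(frozen=True)
class Replica:
    resource: str       # physical resource name
    host: str           # SRB host that owns the resource
    storage_class: str  # e.g. "cache" or "archival"


def pick_replica(replicas, connected_host, rng=random):
    """Pick one replica following the GET priority described on the slide."""
    if not replicas:
        raise ValueError("no replicas to choose from")
    # Step i: keep only the replicas in the best-ranked storage class.
    best = min(CLASS_RANK.get(r.storage_class, 1) for r in replicas)
    candidates = [r for r in replicas if CLASS_RANK.get(r.storage_class, 1) == best]
    # Step ii: prefer replicas on the host the client is connected to.
    local = [r for r in candidates if r.host == connected_host]
    # Step iii: break any remaining tie at random.
    return rng.choice(local or candidates)


if __name__ == "__main__":
    # The "archival" class for these resources is an assumption.
    replicas = [
        Replica("gce-ws.storage", "gce-ws", "archival"),
        Replica("pika191.storage", "pika191", "archival"),
        Replica("gcenode1.storage", "gcenode1", "archival"),
    ]
    print(pick_replica(replicas, "gcenode1").resource)
```

Because every step only narrows the candidate list, the random choice in the last step never overrides the class or host preference.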
  9. Auto multi-replica mechanism <ul><li>Why </li></ul><ul><ul><li>Reliability: multi-replica data </li></ul></ul><ul><ul><li>Access performance: nearest accessibility </li></ul></ul><ul><li>How </li></ul><ul><ul><li>Srsync, Sbkupsrb </li></ul></ul><ul><ul><li>Logical resource (LR) </li></ul></ul><ul><li>Procedure </li></ul><ul><ul><li>Upload new objects using LR-upload </li></ul></ul><ul><ul><li>Create replicas in LR-arch using Sbkupsrb </li></ul></ul><ul><ul><ul><li>Copy the data to every physical resource </li></ul></ul></ul><ul><ul><ul><li>Synchronize all replicas to the latest version </li></ul></ul></ul><ul><ul><li>The client downloads the data and runs the computation </li></ul></ul><ul><ul><li>Upload updated and new data as in step 1 </li></ul></ul><ul><li>Step 2 needs to be executed automatically by the SRB system </li></ul>
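To make the procedure concrete, here is a toy in-memory model in Python of steps 1 and 2. It does not call Srsync or Sbkupsrb; the catalog layout (object name mapped to a version per physical resource) and the function names `upload` and `replicate_to_arch` are assumptions for illustration only.

```python
def upload(catalog, name, upload_pr):
    """Step 1: store a new or updated object on one upload resource."""
    copies = catalog.setdefault(name, {})
    # The new copy becomes the latest version of the object.
    copies[upload_pr] = max(copies.values(), default=0) + 1


def replicate_to_arch(catalog, arch_prs):
    """Step 2: give every archive resource a copy at the latest version."""
    for copies in catalog.values():
        latest = max(copies.values())
        for pr in arch_prs:
            copies[pr] = latest  # creates a missing copy or refreshes a stale one


if __name__ == "__main__":
    catalog = {}
    arch = ["gce-ws.storage", "pika191.storage", "gcenode1.storage"]
    upload(catalog, "input.dat", "gcenode1.upload")
    replicate_to_arch(catalog, arch)
    print(catalog["input.dat"])
```

In this model, running step 2 after every upload keeps all archive copies at the newest version, which is the reason the slide asks for it to run automatically.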
  10. Auto multi-replica mechanism (sample) <ul><li>The initially uploaded data </li></ul>
      $ Sls -al /home/srbadmin.gce-srb:
      srbadmin 0 gcenode1.upload 128 2006-09-26-17.01 % input.dat
      srbadmin 0 gcenode1.upload 620 2006-09-26-17.00 % pika191.ifconfig.txt
      <ul><li>After auto multi-replica </li></ul>
      $ Sls -al /home/srbadmin.gce-srb:
      srbadmin 0 gcenode1.upload 128 2006-09-26-17.01 % input.dat
      srbadmin 1 pika191.storage 128 2006-09-26-17.08 % input.dat
      srbadmin 2 gce-ws.storage 128 2006-09-26-17.08 % input.dat
      srbadmin 0 gcenode1.upload 620 2006-09-26-17.00 % pika191.ifconfig.txt
      srbadmin 1 pika191.storage 620 2006-09-26-17.08 % pika191.ifconfig.txt
      srbadmin 2 gce-ws.storage 620 2006-09-26-17.08 % pika191.ifconfig.txt
      <ul><li>Explanation: </li></ul><ul><li>After auto multi-replica, a replica is stored on every PR in the specified LR. Because of the GET priority order, the next access can be served from the nearest replica. </li></ul>
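A listing like the one above can be checked mechanically. The Python sketch below groups the rows by object name to show where each replica lives. The column meanings (owner, replica number, resource, size, timestamp, a `%` marker, name) are inferred from the sample output and are an assumption, as are the names `SlsRow`, `parse_sls_row`, and `replicas_by_name`.

```python
from collections import defaultdict
from typing import NamedTuple


class SlsRow(NamedTuple):
    owner: str
    replica: int
    resource: str
    size: int
    modified: str
    name: str


def parse_sls_row(line):
    """Split one listing row into its seven whitespace-separated fields."""
    owner, replica, resource, size, modified, _marker, name = line.split(None, 6)
    return SlsRow(owner, int(replica), resource, int(size), modified, name.strip())


def replicas_by_name(lines):
    """Map each object name to the resources that hold a replica of it."""
    grouped = defaultdict(list)
    for line in lines:
        line = line.strip()
        # Skip blank lines, the command line, and the collection header.
        if not line or line.startswith("$") or line.endswith(":"):
            continue
        row = parse_sls_row(line)
        grouped[row.name].append(row.resource)
    return dict(grouped)


SAMPLE = """\
srbadmin 0 gcenode1.upload 128 2006-09-26-17.01 % input.dat
srbadmin 1 pika191.storage 128 2006-09-26-17.08 % input.dat
srbadmin 2 gce-ws.storage 128 2006-09-26-17.08 % input.dat
"""

if __name__ == "__main__":
    print(replicas_by_name(SAMPLE.splitlines()))
```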
  11. Real GCE-SRB Environment <ul><li>DB service </li></ul><ul><ul><li>PostgreSQL: gridportal </li></ul></ul><ul><ul><li>Backup DB: gce-ws </li></ul></ul><ul><li>SRB service </li></ul><ul><ul><li>MES (MCAT-enabled SRB): gce-ws, pika191 </li></ul></ul><ul><ul><li>NonMES: gcenode1 </li></ul></ul><ul><li>GSI DN: </li></ul><ul><ul><li>C=tw, O=nchc, OU=Grid, CN=srb-gsi/gce-ws.nchc.org.tw </li></ul></ul><ul><li>Resource </li></ul><ul><ul><li>PR: gce-ws.storage, gce-ws.upload, pika191.storage, pika191.upload, gcenode1.storage, gcenode1.upload </li></ul></ul><ul><ul><li>LR: </li></ul></ul><ul><ul><ul><li>gce-arch: gce-ws.storage, pika191.storage, gcenode1.storage </li></ul></ul></ul><ul><ul><ul><li>gce-upload: gce-ws.upload, pika191.upload, </li></ul></ul></ul>
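Applied to this environment, the PUT rule from the Data Resource layer slides (prefer a physical resource on the connected SRB server, otherwise choose at random) might look like the Python sketch below. It is not SRB's code: `pick_put_resource` and `host_of` are hypothetical helpers, and deriving the host from the part of the resource name before the dot is an assumption based on the naming shown here. Only gce-arch is modelled, because the gce-upload member list is cut off on the slide.

```python
import random

# Logical resource membership as listed on the slide (gce-arch only).
LOGICAL_RESOURCES = {
    "gce-arch": ["gce-ws.storage", "pika191.storage", "gcenode1.storage"],
}


def host_of(physical_resource):
    """Assumed convention: '<host>.<function>' names a resource on <host>."""
    return physical_resource.split(".", 1)[0]


def pick_put_resource(logical, connected_host, rng=random):
    """Choose the physical resource that receives a PUT to a logical resource."""
    members = LOGICAL_RESOURCES[logical]
    # Prefer a member that lives on the server the client is connected to.
    local = [pr for pr in members if host_of(pr) == connected_host]
    # Otherwise fall back to a random member.
    return rng.choice(local or members)


if __name__ == "__main__":
    print(pick_put_resource("gce-arch", "gcenode1"))
```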
  12. How to use SRB resources in GCE <ul><li>Install sMover on each computing node </li></ul><ul><ul><li>Download from the URL and run install.sh </li></ul></ul><ul><ul><li>The node must be Globus-enabled </li></ul></ul><ul><li>sMover functions </li></ul><ul><ul><li>Check out data </li></ul></ul><ul><ul><li>Check in data </li></ul></ul><ul><li>Procedure </li></ul><ul><ul><li>Get the parameters GCE-SRB needs from the portal </li></ul></ul><ul><ul><li>Make sure the user's GSI proxy is initialized </li></ul></ul><ul><ul><li>Run data check-out before the actual computation </li></ul></ul><ul><ul><li>Run the computation </li></ul></ul><ul><ul><li>Run data check-in after the actual computation </li></ul></ul>
  13. sMover check-in data example <ul><li>sMover provides four modes: get, put, ci, and co </li></ul><ul><li>sMover check-in data: uses the gce-upload resource </li></ul><ul><li>/opt/sMover/bin/sSmover "[srb_serve]" "[port]" "[account]" "[domain]" "[resource]" "ci" "[local_path]" "[server_path]" "[server_DN]" </li></ul><ul><li>Check in data in the GCE environment using gce-ws as the SRB host </li></ul><ul><li>$ /opt/sMover/bin/sSmover "gce-ws.nchc.org.tw" "5544" "srbadmin" "gce-srb" "gce-upload" "ci" "/home/gceuser/gce-run" "/gce-run" "/C=tw/O=nchc/OU=Grid/CN=srb-gsi/gce-ws.nchc.org.tw" </li></ul><ul><ul><li>Querying the gce-srb environment returns </li></ul></ul><ul><li>$ Sls -al /home/srbadmin.gce-srb/gce-run </li></ul><ul><li>/home/srbadmin.gce-srb/gce-run: </li></ul><ul><li>srbadmin 0 gcenode1.upload 296 2006-09-26-18.07 % input.dat </li></ul><ul><li>srbadmin 0 gcenode1.upload 620 2006-09-26-18.07 % my-run.sh </li></ul><ul><li>srbadmin 0 gcenode1.upload 1997 2006-09-26-18.07 % output.dat </li></ul><ul><ul><li>After auto multi-replica </li></ul></ul><ul><li>$ Sls -al /home/srbadmin.gce-srb/gce-run </li></ul><ul><li>/home/srbadmin.gce-srb/gce-run: </li></ul><ul><li>srbadmin 0 gcenode1.upload 296 2006-09-26-18.07 % input.dat </li></ul><ul><li>srbadmin 1 gce-ws.storage 296 2006-09-26-18.09 % input.dat </li></ul><ul><li>srbadmin 2 pika191.storage 296 2006-09-26-18.09 % input.dat </li></ul><ul><li>srbadmin 0 gcenode1.upload 620 2006-09-26-18.07 % my-run.sh </li></ul><ul><li>srbadmin 1 gce-ws.storage 620 2006-09-26-18.09 % my-run.sh </li></ul><ul><li>srbadmin 2 pika191.storage 620 2006-09-26-18.09 % my-run.sh </li></ul><ul><li>srbadmin 0 gcenode1.upload 1997 2006-09-26-18.07 % output.dat </li></ul><ul><li>srbadmin 1 gce-ws.storage 1997 2006-09-26-18.09 % output.dat </li></ul><ul><li>srbadmin 2 pika191.storage 1997 2006-09-26-18.09 % output.dat </li></ul>
  14. sMover check-out data example <ul><li>sMover check-out data </li></ul><ul><ul><li>Uses the gce-arch resource </li></ul></ul><ul><ul><li>/opt/sMover/bin/sSmover "[srb_serve]" "[port]" "[account]" "[domain]" "[resource]" "co" "[local_path]" "[server_path]" "[server_DN]" </li></ul></ul><ul><li>Check out data in the GCE environment using gcenode1 as the SRB host </li></ul><ul><li>$ /opt/sMover/bin/sSmover "gcenode1.nchc.org.tw" "5544" "srbadmin" "gce-srb" "gce-arch" "co" "/home/gceuser/gce-run" "/gce-run" "/C=tw/O=nchc/OU=Grid/CN=srb-gsi/gce-ws.nchc.org.tw" </li></ul><ul><ul><li>Querying the gce-srb environment returns </li></ul></ul><ul><li>$ Sls -al /home/srbadmin.gce-srb/gce-run </li></ul><ul><li>/home/srbadmin.gce-srb/gce-run: </li></ul><ul><li>srbadmin 0 gcenode1.upload 296 2006-09-26-18.07 % input.dat </li></ul><ul><li>srbadmin 0 gcenode1.upload 620 2006-09-26-18.07 % my-run.sh </li></ul><ul><li>srbadmin 0 gcenode1.upload 1997 2006-09-26-18.07 % output.dat </li></ul><ul><ul><li>After auto multi-replica </li></ul></ul><ul><li>$ Sls -al /home/srbadmin.gce-srb/gce-run </li></ul><ul><li>/home/srbadmin.gce-srb/gce-run: </li></ul><ul><li>srbadmin 0 gcenode1.upload 296 2006-09-26-18.07 % input.dat </li></ul><ul><li>srbadmin 1 gce-ws.storage 296 2006-09-26-18.09 % input.dat </li></ul><ul><li>srbadmin 2 pika191.storage 296 2006-09-26-18.09 % input.dat </li></ul><ul><li>srbadmin 0 gcenode1.upload 620 2006-09-26-18.07 % my-run.sh </li></ul><ul><li>srbadmin 1 gce-ws.storage 620 2006-09-26-18.09 % my-run.sh </li></ul><ul><li>srbadmin 2 pika191.storage 620 2006-09-26-18.09 % my-run.sh </li></ul><ul><li>srbadmin 0 gcenode1.upload 1997 2006-09-26-18.07 % output.dat </li></ul><ul><li>srbadmin 1 gce-ws.storage 1997 2006-09-26-18.09 % output.dat </li></ul><ul><li>srbadmin 2 pika191.storage 1997 2006-09-26-18.09 % output.dat </li></ul>
  15. Integration issues (with computing) <ul><li>Computing node requirements </li></ul><ul><ul><li>Globus-enabled: for GCE, the certificate must be signed by the NCHC CA </li></ul></ul><ul><ul><li>Install sMover: http://gridportal.nchc.org.tw/gce-SRB_Installation/srbSmover.i386.20060919.tgz </li></ul></ul><ul><ul><li>co -> run the computation -> ci </li></ul></ul><ul><li>Using sMover to synchronize computing data and create multiple replicas requires several parameters </li></ul><ul><ul><li>Already defined in the configuration file </li></ul></ul><ul><ul><ul><li>defaultArchR=gce-arch (used to create multiple replicas) </li></ul></ul></ul><ul><ul><ul><li>defaultZone=gcezone </li></ul></ul></ul><ul><ul><li>Read from command-line arguments </li></ul></ul><ul><ul><ul><li>"[srb_serve]": gce-ws | pika191 | gcenode1 (how to decide is still open) </li></ul></ul></ul><ul><ul><ul><li>"[port]": 5544 </li></ul></ul></ul><ul><ul><ul><li>"[account]": the account used on SRB, provided by the portal (individual or shared) </li></ul></ul></ul><ul><ul><ul><li>"[domain]": gce-srb </li></ul></ul></ul><ul><ul><ul><li>"[resource]": gce-upload </li></ul></ul></ul><ul><ul><ul><li>"ci|co": check in or check out </li></ul></ul></ul><ul><ul><ul><li>"[local_path]": path on the computing host </li></ul></ul></ul><ul><ul><ul><li>"[server_path]": path on SRB relative to the user home /home/account.gce-srb/xxx </li></ul></ul></ul><ul><ul><ul><li>"[server_DN]": /C=tw/O=nchc/OU=Grid/CN=srb-gsi/gce-ws.nchc.org.tw (fixed or not is still open) </li></ul></ul></ul><ul><ul><li>After check-in completes, multiple replicas are created for the next access </li></ul></ul>
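The nine positional parameters are easy to get out of order, so a computing-node wrapper could assemble them in one place. The Python sketch below only builds and prints the command lines for the co -> compute -> ci sequence; it does not run anything. The binary path and argument order are taken from the examples in this deck, while `smover_argv`, `job_plan`, and the `./my-run.sh` compute step are hypothetical names used for illustration.

```python
import shlex

SMOVER = "/opt/sMover/bin/sSmover"  # path as shown in the deck's examples
MODES = ("get", "put", "ci", "co")


def smover_argv(srb_server, port, account, domain, resource, mode,
                local_path, server_path, server_dn):
    """Build the sMover argument list in the order used in the deck."""
    if mode not in MODES:
        raise ValueError(f"unknown sMover mode: {mode!r}")
    return [SMOVER, srb_server, str(port), account, domain, resource,
            mode, local_path, server_path, server_dn]


def job_plan(srb_server, account, local_path, server_path, server_dn, compute_cmd):
    """Return the three commands of a job: check out, compute, check in."""
    common = dict(srb_server=srb_server, port=5544, account=account,
                  domain="gce-srb", local_path=local_path,
                  server_path=server_path, server_dn=server_dn)
    return [
        smover_argv(resource="gce-arch", mode="co", **common),
        list(compute_cmd),
        smover_argv(resource="gce-upload", mode="ci", **common),
    ]


if __name__ == "__main__":
    plan = job_plan(
        "gcenode1.nchc.org.tw", "srbadmin", "/home/gceuser/gce-run", "/gce-run",
        "/C=tw/O=nchc/OU=Grid/CN=srb-gsi/gce-ws.nchc.org.tw",
        ["./my-run.sh"],  # hypothetical compute step
    )
    for argv in plan:
        print(shlex.join(argv))
```

Keeping each command as an argument list rather than one shell string avoids quoting problems if a path or DN ever contains spaces.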
  16. Integration issues (with portal) <ul><li>GCE related: </li></ul><ul><ul><li>How is the list obtained? (From a config file, ...) </li></ul></ul><ul><ul><li>Let the user choose which SRB server to connect to </li></ul></ul><ul><li>Data-management portal </li></ul><ul><ul><li>File upload </li></ul></ul><ul><ul><li>File management </li></ul></ul>
  17. How to add an SRB node <ul><li>http://gridportal.nchc.org.tw/gce-SRB_Installation/ </li></ul><ul><li>Install the main program </li></ul><ul><li>Install the GSI-related components </li></ul><ul><li>Install the startup service </li></ul><ul><li>Install additional tools (optional) </li></ul><ul><li>System admin </li></ul><ul><ul><li>Add location </li></ul></ul><ul><ul><li>Add physical resource </li></ul></ul><ul><ul><li>Add PR into LR (logical resource) </li></ul></ul>
  18. References <ul><li>SDSC SRB </li></ul><ul><ul><li>http://www.sdsc.edu/srb/ </li></ul></ul><ul><li>Scommands for advanced users </li></ul><ul><ul><li>http://www.sdsc.edu/srb/index.php/Scommands#Advanced_Users </li></ul></ul><ul><li>MCAT SRB GSI-enabled server installation and management </li></ul><ul><ul><li>http://www.ceasar.tw/modules/news/article.php?storyid=9 </li></ul></ul><ul><li>How to use Scommands on SRB server </li></ul><ul><ul><li>http://www.ceasar.tw/modules/news/article.php?storyid=8 </li></ul></ul>
