Python + Hive on AWS EMR: Poor Man's Log Aggregation
PyCon JP 2014
 
Akira Chiku (achiku)

Name: Akira Chiku
Twitter: @_achiku
GitHub: @achiku
 
馯㄂拦莸	LJSB$IJLV'JSFד嗚稊
 
Occupation: Engineer @kanmu
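Goal

I want to hear everyone's stories (that is my own goal)

Ø Share a case study focused on "why this architecture?"
Ø Share ways of using Python + EMR that go "beyond the basics"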
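Kanmu (Business)
Ø Payment data analysis in partnership with credit card companies
Ø Delivery of coupons linked to cards
Ø Card Linked Offer (CLO)

Card Linked Offer
[Diagram: card company, Kanmu, cardholders, and stores]
• "Enter for a store's coupon, then shop with that card to earn points."
• "We want to offer this kind of coupon to this kind of customer."
• "Wouldn't customers with this purchasing tendency be a better fit?"
• "Given these results, let's cut the segment this way next time."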
Quick Survey
Ø Who here is involved in log analysis?
Ø Who here uses Hadoop?
Ø Who here uses Hive?
Ø Who here uses EMR?
Sharing a case study focused on "why this architecture?"
Our premises
The title of this talk
Poor man's log aggregation
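Poorman's
Ø Start from the situation we are actually in
Ø An attitude of reaching the goal with few resources (people, time, experience)
Ø An attitude of raising the speed of reaching the goal as far as possible
Ø An attitude of avoiding unnecessary spending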
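Kanmu (Engineer Team)
• maki (CEO/Engineer)
• _ideyuta (Designer)
• moqada (Engineer)
• _achiku (Engineer)
Responsibilities shown on the slide: running the company, R&D, design, front end, smartphone apps, log analysis, back end, infrastructure, analytics platform, partner liaison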
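Requirements
Ø We want to aggregate a fairly large volume of data without stress
• Ave. … G/day, Max … G/day (uncompressed)
• … GB/month (uncompressed)
• Expected to grow as the service grows
• We also want to run ad hoc queries over a month of data
Ø We want to build up in-house know-how on processing large-scale data
• Build up that know-how while pacing ourselves
• Some data is sensitive and hard to send outside the company
Ø We want to keep operating costs and initial investment low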
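Not Requirements
Ø Real-time analysis is not a high priority at present
Ø The service level of the ad hoc analytics platform does not need to be high
• Ordinary batch jobs must not fail, though
• It does not have to be available at all times
• Usage is limited to inside the company
Ø However, any of the above may well become a requirement later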
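Amazon Elastic MapReduce
AWS EMR
Ø One of the many AWS services
Ø Hadoop and software from the Hadoop ecosystem are available by default
Ø Launching, running jobs, and stopping can all be driven through the API
Ø Monitoring and the like are taken care of for you
Ø S3 can be used in place of HDFS
Ø Changing the number of nodes in a cluster is easy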
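Architecture
[Diagram: a management server and two coupon delivery servers]
• Fluentd on the delivery servers collects the logs
• The Fluentd S3 plugin saves the collected logs to S3
• Hive on EMR processes and aggregates the logs
• Aggregated values are stored in RDS and visualized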
Data Analysis Flow (by tagomoris)
Collect, Parse, Cleanup, Store, Process, Visualize
Source: http://www.slideshare.net/tagomoris/handling-not-so-big-data
Poorman's Data Analysis Flow
Collect, Parse, Cleanup, Store, Process, Visualize
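Collect
Ø Ship logs from the coupon delivery servers using Fluentd with the Fluentd S3 plugin
Ø Each shipped log line carries the user's action in the query string
• Do not ship complex JSON; put the information in the query string instead
• Convert everything to JSON at aggregation time in Hive
• https://example.com/beacon?sub=…&obj=coupon&action=click&cid=…
Ø No Fluentd aggregator server is used
• Real-time aggregation is not a high priority at present
• An aggregator would need a redundant setup, which adds complexity
• We would rather lean on the stability of S3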
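The idea of carrying the user's action in the query string and turning it into JSON later can be illustrated with a minimal Python sketch. In the talk this conversion happens inside Hive through a UDF; the function name and the parameter values below are assumptions made only for illustration.

```python
import json
from urllib.parse import parse_qsl, urlsplit


def query_string_to_json(url):
    """Return the query-string parameters of a beacon URL as a JSON string."""
    query = urlsplit(url).query
    # keep_blank_values keeps parameters that were sent with an empty value
    params = dict(parse_qsl(query, keep_blank_values=True))
    return json.dumps(params, sort_keys=True)


if __name__ == "__main__":
    # Hypothetical beacon request; the parameter values are made up.
    print(query_string_to_json(
        "https://example.com/beacon?obj=coupon&action=click&cid=c42"))
```

Keeping the log line flat like this means the web servers never have to build nested JSON, and the structure can be decided at aggregation time.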
Store
Ø For now, just ship everything to S3
Ø Keep separate S3 buckets for production and staging
• Access can be controlled per bucket
• example.com.production.log
Ø Split keys by server role
• No worries even when other kinds of servers are added
• example.com.production.log/api
Ø Split keys by date
• So that Hive partitions can be used
• example.com.production.log/api/dt=…
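The key layout described above (one prefix per server role, then one `dt=` prefix per day so Hive can treat each day as a partition) could be generated along these lines. This is a sketch: the helper name is hypothetical, and the `YYYYMMDD` date format is borrowed from the `dt=20140818` paths that appear in the step definitions later in the deck.

```python
from datetime import date


def log_key_prefix(role, day):
    """Build an S3 key prefix of the form '<role>/dt=YYYYMMDD/'."""
    return "{}/dt={}/".format(role, day.strftime("%Y%m%d"))


if __name__ == "__main__":
    # "api" is an example server role
    print(log_key_prefix("api", date(2014, 8, 18)))
```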
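Process
Ø The tried-and-true nightly batch
• Launch an EMR cluster (Hadoop + Hive) from the management server
• Compress and merge the log files, which are fragmented because no Fluentd aggregator is used (Hadoop handles many small files poorly)
• Convert the query strings recorded in the logs to JSON with a UDF
• Aggregate along the axes worth watching and save the results
• Run all of the processing above against S3, without landing data on HDFS
• Store the final aggregated values in RDS
Ø Flexible, fast daytime queries
• Launch an EMR cluster (Hadoop + Hive + Presto) from the management server
• Presto refers to the Hive metastore (table definitions)
• All data lives on S3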
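Visualize
Ø Load the data aggregated on EMR into MySQL
Ø Visualize the values with a service running on the management server
• Every team member checks the numbers against the same values
Ø Elasticsearch + Kibana were installed on a spare in-house server as an experiment
• Handy when you want to poke at the data while deciding which axes to analyze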
Poorman's Data Analysis Flow
YAGNI
But anything can be added once it becomes necessary
Just make sure we do not paint ourselves into a corner
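References
Ø AWS Amazon EMR Best Practices
• Reading this shows which EMR setup fits your own context. It probably also works as an introduction to Hadoop.
Ø Introduction to mixi's analytics platform and its use of a JSON parser in Apache Hive
• Gave us the idea of storing logs as JSON and using views to treat them like tables. Also gave us the notion of the communication cost among the people involved in log aggregation.
Ø Batch Processing and Stream Processing by SQL
• Hearing this talk made us decide to use an MPP-style engine for the analytics platform. We compared Impala and Presto and adopted Presto, which can query S3 directly. (Impala is also said to support direct queries against S3 in its next version, so we plan to evaluate it again then.)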
Sharing ways of using Python + EMR that go "beyond the basics" (log aggregation)
[Roadmap diagram: use awscli to do the following: Create Cluster, Execute HiveQL, Execute s3distcp, Config Your EMR, Bootstrap Presto, Metastore Config; use EMR from a Python script: Create Cluster, Execute HiveQL, JobFlow Mgmnt]
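awscli
Ø As of release Ver. …, the EMR commands are out of Preview status and can finally be used as a stable API
Ø We switched over from the Ruby Elastic MapReduce script, which had been the de facto tool
• Easy to install with pip
• We were already using awscli, so this unifies our tooling
• Development on GitHub is active, and you can send PRs
We Love Python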
$ mkvirtualenv pycon-emr-dev
(pycon-emr-dev)$ pip install awscli
(pycon-emr-dev)$ mkdir ~/.awscli
(pycon-emr-dev)$ cat <<-EOF > ~/.awscli/config
[profile development]
aws_access_key_id=development_access_key
aws_secret_access_key=development_secret_key
region=ap-northeast-1
EOF
(pycon-emr-dev)$ cat <<-EOF >> $VIRTUAL_ENV/bin/activate
export AWS_CONFIG_FILE=~/.awscli/config
export AWS_DEFAULT_PROFILE=development
source aws_zsh_completer.sh
EOF
$ aws emr create-cluster --ami-version 3.1.1 \
    --name 'PyConJP 2014 (AMI 3.1.1 Hive)' \
    --tags Name=pycon-jp-emr environment=development \
    --ec2-attributes KeyName=yourkey \
    --log-uri 's3://yourbucket/jobflow_logs/' \
    --no-auto-terminate \
    --visible-to-all-users \
    --instance-groups file://./normal-instance-setup.json \
    --applications file://./app-hive.json
normal-instance-setup.json
[
  {
    "Name": "emr-master",
    "InstanceGroupType": "MASTER",
    "InstanceCount": 1,
    "InstanceType": "m1.medium"
  },
  {
    "Name": "emr-core",
    "InstanceGroupType": "CORE",
    "InstanceCount": 2,
    "InstanceType": "m1.medium"
  }
]

app-hive.json
[
  {
    "Name": "HIVE"
  }
]
result
{
  "ClusterId": "j-8xxxxxxxxx"
}
$ aws emr add-steps --cluster-id j-8xxxxxxxxx \
    --steps file://./hive-sample-step-1.json

hive-sample-step-1.json
[
  {
    "Args": [
      "-f", "s3n://yourbucket/hive-script/sample01.hql",
      "-d", "BUCKET_NAME=yourbucket",
      "-d", "TARGET_DATE=20140818"
    ],
    "ActionOnFailure": "CONTINUE",
    "Name": "Hive Sample Program 01",
    "Type": "HIVE"
  },
  {
    "Args": [
      "-f", "s3n://yourbucket/hive-script/sample02.hql",
      "-d", "BUCKET_NAME=yourbucket",
      "-d", "TARGET_DATE=20140818"
    ],
    "ActionOnFailure": "CONTINUE",
    "Name": "Hive Sample Program 02",
    "Type": "HIVE"
  }
]
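Because the two step definitions differ only in the script name, a step file like this could also be generated rather than written by hand. The following is a minimal sketch under that assumption; the function name `hive_steps` and the step naming scheme are hypothetical, while the keys and argument layout mirror the JSON shown on the slide.

```python
import json


def hive_steps(bucket, target_date, query_files):
    """Build one HIVE step definition per query file."""
    steps = []
    for query_file in query_files:
        steps.append({
            "Type": "HIVE",
            "Name": "Hive Sample Program [{}]".format(query_file),
            "ActionOnFailure": "CONTINUE",
            "Args": [
                "-f", "s3n://{}/hive-script/{}".format(bucket, query_file),
                "-d", "BUCKET_NAME={}".format(bucket),
                "-d", "TARGET_DATE={}".format(target_date),
            ],
        })
    return steps


if __name__ == "__main__":
    # Print JSON that could be saved to a file and passed to --steps
    steps = hive_steps("yourbucket", "20140818", ["sample01.hql", "sample02.hql"])
    print(json.dumps(steps, indent=2))
```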
$ aws emr add-steps --cluster-id j-8xxxxxxxxx \
    --steps file://./s3distcp-sample-step.json

s3distcp-sample-step.json
[
  {
    "Name": "s3distcp Sample",
    "ActionOnFailure": "CONTINUE",
    "Jar": "/home/hadoop/lib/emr-s3distcp-1.0.jar",
    "Type": "CUSTOM_JAR",
    "Args": [
      "--src", "s3n://yourbucket/access_log/dt=20140818",
      "--dest", "s3n://yourbucket/compressed_log/dt=20140818",
      "--groupBy", ".*(nginx_access_log-).*",
      "--targetSize", "100",
      "--outputCodec", "gzip"
    ]
  }
]
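The `--groupBy` pattern decides which small files end up merged together: as far as I understand S3DistCp, files whose names yield the same value for the capture group are concatenated into one output. The sketch below only imitates that grouping rule in plain Python to show what the pattern above selects; it is not S3DistCp itself, the file names are made up, and names that do not match are simply ignored here.

```python
import re
from collections import defaultdict

# Same pattern as the --groupBy argument in the step definition
GROUP_BY = re.compile(r".*(nginx_access_log-).*")


def group_files(names):
    """Group file names by the value of the pattern's capture group."""
    groups = defaultdict(list)
    for name in names:
        match = GROUP_BY.match(name)
        if match:
            groups[match.group(1)].append(name)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical file names under one dt= prefix
    print(group_files([
        "nginx_access_log-2014081800_0.gz",
        "nginx_access_log-2014081801_0.gz",
        "error_log-2014081800_0.gz",
    ]))
```

With this pattern every access log file of the day falls into a single group, which matches the goal of merging fragmented files before Hive reads them.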
$ aws emr create-cluster --ami-version 3.1.1 \
    --name 'PyConJP 2014 (AMI 3.1.1 Hive)' \
    --tags Name=pycon-jp-emr environment=development \
    --ec2-attributes KeyName=yourkey \
    --log-uri 's3://yourbucket/jobflow_logs/' \
    --no-auto-terminate \
    --visible-to-all-users \
    --instance-groups file://./normal-instance-setup.json \
    --applications file://./app-hive-with-config.json

app-hive-with-config.json
[
  {
    "Args": [
      "--hive-site=s3://yourbucket/libs/config/hive-site.xml"
    ],
    "Name": "HIVE"
  }
]

hive-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hive.optimize.s3.query</name>
    <value>true</value>
    <description>Optimize query on S3</description>
  </property>
</configuration>
$ aws emr create-cluster --ami-version 3.1.1 \
    --name 'PyConJP 2014 (AMI 3.1.1 Hive + Presto)' \
    --tags Name=pycon-jp-emr environment=development \
    --ec2-attributes KeyName=yourkey \
    --log-uri 's3://yourbucket/jobflow_logs/' \
    --no-auto-terminate \
    --visible-to-all-users \
    --instance-groups file://./normal-instance-setup.json \
    --bootstrap-actions file://./bootstrap-presto.json \
    --applications file://./app-hive-with-config.json

bootstrap-presto.json
[
  {
    "Name": "Install/Setup Presto",
    "Path": "s3://yourbucket/libs/setup-presto.rb",
    "Args": [
      "--task_memory", "1GB",
      "--log-level", "DEBUG",
      "--version", "0.75",
      "--presto-repo-url", "http://central.maven.org/maven2/com/facebook/presto/",
      "--sink-buffer-size", "1GB",
      "--query-max-age", "1h",
      "--jvm-config", "-server -Xmx2G -XX:+UseConcMarkSweepGC -XX:+ExplicitGCInvokesConcurrent -XX:+CMSClassUnloadingEnabled -XX:+AggressiveOpts -XX:+HeapDumpOnOutOfMemoryError -XX:OnOutOfMemoryError=kill -9 %p -XX:PermSize=150M -XX:MaxPermSize=150M -XX:ReservedCodeCacheSize=150M -Dhive.config.resources=/home/hadoop/conf/core-site.xml,/home/hadoop/conf/hdfs-site.xml"
    ]
  }
]
Ø TFUVQQSFTUPSC㹋䡾כ	IUUQTHJUIVCDPN 
BXTMBCTFNSCPPUTUSBQBDUJPOTCMPCNBTUFS 
QSFTUPJOTUBMM
 
Ø 84ָ㹋꿀涸ח⳿׃ג׷1SFTUP׾.3חⰅ׸׷捀 
ך#PPUTUSBQأؙٔفز 
Ø .*PSדכ⹛ְ׋ֽוծ.*דכ 
⹛ַזַ׏׋	)JWF)JWF
 
Ø 5ISJGU4FSWJDFךه٦زָ殯ז׷׏שְ
Ø .FUBTUPSFהכ)JWFךذ٦ـٕ㹀纏瘝ך䞔㜠׾⥂ 
 
㶷׃גֶֻ㜥䨽ךֿה 
Ø 植㖈㢳ֻכ.Z42-ָⵃ欽ׁ׸גְ׷ 
Ø ⡦׮鏣㹀׃זְה.3ך؎ٝأةٝأך.Z42-ח 
⥂㶷ׁ׸׷ 
Ø .FUBTUPSF׾.3㢩鿇ך%#ח鏣㹀׃גֶֻֿהדծ 
.3甧׍♳־׷ꥷח%%-׾ⱄ䏝崧ׁזֻג׮葺ֻ 
ז׷ 
Ø %#⩎ך4FDVSJUZ(SPVQ׾⥜姻ׅ׷䗳銲֮׶
hive-site.xml (referenced from app-hive-with-config.json)
<configuration>
  <property>
    <name>hive.optimize.s3.query</name>
    <value>true</value>
    <description>Optimize query on S3</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://hostname:3306/hive?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>username</value>
    <description>Username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>password</value>
    <description>Password to use against metastore database</description>
  </property>
</configuration>
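Since the metastore settings differ per environment (host, user, password), one option is to render hive-site.xml from a dictionary before uploading it to S3. This is only a sketch of that idea using the standard library; the function name is hypothetical, descriptions are omitted, and the values are the placeholders from the slide.

```python
import xml.etree.ElementTree as ET


def build_hive_site(properties):
    """Render a Hadoop-style <configuration> document from name/value pairs."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_hive_site({
        "hive.optimize.s3.query": "true",
        # Placeholder connection settings, as on the slide
        "javax.jdo.option.ConnectionURL":
            "jdbc:mysql://hostname:3306/hive?createDatabaseIfNotExist=true",
        "javax.jdo.option.ConnectionDriverName": "com.mysql.jdbc.Driver",
        "javax.jdo.option.ConnectionUserName": "username",
        "javax.jdo.option.ConnectionPassword": "password",
    }))
```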
Ø 1ZUIPOغحثⳢ椚ⰻד.3׾饯⹛׃׋ְ✲׮֮׷ 
Ø ׮׃ֻכ$FMFSZך5BTLה׃ג饯⹛׃׋ְהַ 
Ø ׉ְֲ׏׋㜥さחכ1ZUIPOך⚥ַ׵.3׾⢪ֲ✲ 
 
׮〳腉 
Ø CPUPFNS׾ⵃ欽ׅ׷ 
Ø BXTDMJⰻַ׵⤑ⵃז6UJMJUZ׾《׏גֹג⢪ֲך׮ 
֮׶ַ׮
# -*- coding: utf-8 -*-
from datetime import datetime

from boto.emr import connect_to_region
from boto.emr.step import InstallHiveStep


def setup_emr():
    # need to export AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY
    # as environment variables.
    conn = connect_to_region('ap-northeast-1')
    install_step = InstallHiveStep(hive_versions='0.11.0.2')
    jobid = conn.run_jobflow(
        name='Create EMR [{}]'.format(datetime.today().strftime('%Y%m%d')),
        log_uri='s3://yourbucket/jobflow_logs/',
        ec2_keyname='your_key',
        master_instance_type='m1.medium',
        slave_instance_type='m1.medium',
        num_instances=3,
        action_on_failure='TERMINATE_JOB_FLOW',
        keep_alive=True,
        enable_debugging=False,
        hadoop_version='2.4.0',
        steps=[install_step],
        bootstrap_actions=[],
        instance_groups=None,
        additional_info=None,
        ami_version='3.1.1',
        api_params=None,
        visible_to_all_users=True,
        job_flow_role=None)
    return jobid


if __name__ == '__main__':
    jobflow_id = setup_emr()
    print('JobFlowID: {} started.'.format(jobflow_id))
Ø 84ךؙٖرٝءٍٕכا٦أⰻחⰅ׸זְ✲ 
• 橆㞮㢌侧חⰅ׸׷׮װ׭׋倯ָ葺ְ 
• ٗ٦ٕؕوءٝדذأز׃׋ְ㜥さכ䊺׬搀׃ַ 
• .3׾甧׍♳־׷$ח➰♷ׅ׷*.3PMFדⵖ䖴
# The script got long, so this excerpt only conveys the gist.
# conn, install_step, target_date, bucket_name and hive_version
# are defined elsewhere in the script.
from boto.emr.step import HiveStep

jobid = conn.run_jobflow(
    name='Create EMR and Exec hiveql [{}]'.format(target_date),
    log_uri='s3://{}/jobflow_logs/'.format(bucket_name),
    ec2_keyname='your_key',
    master_instance_type='m1.medium',
    slave_instance_type='m1.medium',
    num_instances=3,
    action_on_failure='TERMINATE_JOB_FLOW',
    keep_alive=True,
    enable_debugging=False,
    hadoop_version='2.4.0',
    steps=[install_step],
    bootstrap_actions=[],
    instance_groups=None,
    additional_info=None,
    ami_version='3.1.1',
    api_params=None,
    visible_to_all_users=True,
    job_flow_role=None)

query_files = ['sample01.hql', 'sample02.hql']
hql_steps = []
for query_file in query_files:
    hql_step = HiveStep(
        name='Executing Query [{}]'.format(query_file),
        hive_file='s3n://{0}/hive-script/{1}'.format(
            bucket_name, query_file),
        hive_versions=hive_version,
        hive_args=['-dTARGET_DATE={0}'.format(target_date),
                   '-dBUCKET_NAME={0}'.format(bucket_name)])
    hql_steps.append(hql_step)
conn.add_jobflow_steps(jobid, hql_steps)
Ø غحثⳢ椚ח⣛㶷ꟼ⤘׾⡲׶׋ְ 
• ָ穄׻׏׋׵#ה$ず儗ח㹋遤ׅ׷ծ瘝 
• ה#ָ穄׻׏׋׵$׾㹋遤ׅ׷ծ瘝 
Ø 饯⹛儗꟦ך盖椚׾׮׏ה䩛鯪ח遤ְ׋ְ
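• https://github.com/spotify/luigi
• A pipeline management framework written in Python
• Has a mechanism that makes MapReduce jobs on Hadoop Streaming easy to write
• Dependencies are resolved with Python code alone
• Dependency visualization (started as a separate service)
• The dependency visualization tool has no fine-grained features such as authentication
• Supports running HiveQL
• Supports running Pig
• Supports S3 operations
• Overkill for our current situation

• The admin screen uses Django
• celery and celerybeat run on the same server
• django-celery is used to put specific tasks on the queue at specific times
• celerybeat picks up the queued tasks and runs them
• celery and Django can work together without django-celery, but this scheduling feature is convenient, so we still use it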
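References
Ø https://github.com/aws/aws-cli
• Official documentation and source
Ø https://github.com/boto/boto
• Official documentation and source
Kanmu
We are actively looking for teammates
Feel free to come by just for a chat
https://www.wantedly.com/projects/…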

Python + Hive on AWS EMR で貧者のログサマリ

  • 3. BDIJLV /BNFLJSB$IJLV 5XJUUFS!@BDIJLV (JU)VC!BDIJLV 馯㄂拦莸 LJSB$IJLV'JSFד嗚稊 耵噟ؒٝآص،!LBONV
  • 4. (PBM 涺ׁ׿ך鑧׾耀ֹ׋ְ 荈ⴓָ Ø չז׈׉ך圓䧭זךַպח搊挿׾縧ְ׋✲⢽ךⰟ剣 Ø չⰅꟌ⟃♳պ׾湡䭷׃׋1ZUIPO
  • 6. ,BONV #VTJOFTT Ø ؕ٦س⠓爡ה⼿噟׃׋寸幥ر٦ةⴓ匿 Ø ؕ٦سח秡בֻؙ٦هٝךꂁ⥋ Ø $BSE-JOLFE0FS $-0
  • 8. $BSE-JOLFE0FS չ䏄ךؙ٦هٝؒٝزٔ٦׃ ׋ؕ٦سד顠ְ暟ׅ׸לه ؎ٝز؜حزկպ ؕ٦س ⠓爡 ,BONV չְֲֲֶֿ㹏ׁ׿ח ְֲֲֿؙ٦هٝ⳿׃׋ְպ չְֲֲֿ飑顠⫘ぢךֶ㹏ׁ׿ך倯 ְְָךדכպ չֿ׿ז穠卓׌׏׋ךדծ如㔐כֿ ְֲֲإًؚٝزⴖ׶ת׃׳ֲպ ؕ٦س ⠓㆞ ֶ䏄
  • 9. 2VJDL4VSWFZ Ø ؚٗⴓ匿חꟼ׻׏גְ׵׏׃ׯ׷倯 Ø )BEPPQ⢪׏ג׵׏׃ׯ׷倯 Ø )JWF⢪׏ג׵׏׃ׯ׷倯 Ø .3⢪׏ג׵׏׃ׯ׷倯
  • 14. 1PPSNBOˏT Ø ➙֮׷植朐׾⯋ח Ø 満ٔا٦أ ➂儗꟦穗꿀 ד湡涸׾麦䧭ׅ׷㪦⹲ Ø 湡涸׾麦䧭ׅ׷أؾ٦س׾〳腉זꣲ׶♳־׷㪦⹲ Ø 搀欽ז佄⳿׾鼘ֽ׷㪦⹲
  • 15. ,BONV OHJOFFS5FBN NBLJ $0OHJOFFS @JEFZVUB %FTJHOFS NPRBEB OHJOFFS @BDIJLV OHJOFFS 爡ꞿ噟灇瑔Ꟛ涪 رؠ؎ٝؿٗٝز أوم،فؚٔٗⴓ匿 ؿٗٝزغحؙؒٝس ؎ٝؿٓأوم،فٔ غحؙؒٝس؎ٝؿٓ ⴓ匿㛇湍ؚٗⴓ匿㼎ػ٦زش٦璞〡
  • 17. 剢 ꬊ㖇簭 • ؟٦ؽأך䧭ꞿהⰟח㟓ִ׷鋅鴥׫ • 剢⽃⡘ד،سمحؙזؙؒٔ׮䫎־׋ְ Ø 爡ⰻח㣐鋉垷ر٦ة׾Ⳣ椚ׅ׷濼鋅׾顕׭׋ְ • չل٦أꂁⴓ׾׃׋♳דպ濼鋅׾顕׭׷ • 㢩鿇ח⳿׃חְֻإٝءذ؍ـזر٦ة׮㶷㖈 Ø 麊欽؝أزⴱ劍䫎项׾⡚ֻ䫇ִ׋ְ
  • 18. /PU3FRVJSFNFOUT Ø ⴓ匿ָٔ،ٕة؎يד֮׷䗳銲䚍כ植朐넝ֻזְ Ø ،سمحؙⴓ匿㛇湍ך؟٦ؽأٖكٕכ寸׃ג넝ֻזְ • 兛鸐ךغحثⳢ椚כ衅׍גכ꼽湡׌ֽו • 䌢חⵃ欽〳腉ז朐䡾חז׏גְזֻג׮葺ְ • ⵃ欽כ爡ⰻחꣲ㹀ׁ׸גְ׷ Ø ׋׌׃ծ♳鎸ָ3FRVJSFNFOUTחז׷〳腉䚍כ⼧ⴓ剣׷
  • 20. 84.3 Ø 侧֮׷84؟٦ؽأךֲ׍ך♧א Ø )BEPPQװ)BEPPQؒ؝ءأذيⰻך48ָر ؿٕؓزדⵃ欽〳腉 Ø 1*ד饯⹛ծ+PCך㹋遤ծ⨡姺׾乼⡲〳腉 Ø ٌصةؚٔٝ瘝׮״׃זח㹋倵׃גֻ׸׷ Ø 4׾)%'4ך剏׶חⵃ欽〳腉 Ø ؙٓأةך〴侧㢌刿ָ㺁僒
  • 21. SDIJUFDUVSF 盖椚؟٦غ ؙ٦هٝ ꂁ⥋؟٦غ ؙ٦هٝ ꂁ⥋؟٦غ • ꂁ⥋؟٦غ♳ך'MVFOUEדؚٗ꧊《 • VFOUETQMVHJOד ꧊׃׋ؚٗ׾ 4♳ח⥂㶷 • .3♳ך)JWFדؚٗ׾⸇䊨ծ꧊鎘 • ꧊鎘⦼׾3%4ח⥂㶷׃ג〳鋔⻉
  • 22. %BUBOBMZTJT'MPX CZUBHPNPSJT 1SPDFTT $PMMFDU 1BSTF $MFBOVQ 4UPSF 1SPDFTT 7JTVBMJ[F ⳿ⰩIUUQXXXTMJEFTIBSFOFUUBHPNPSJTIBOEMJOHOPUTPCJHEBUB
  • 23. 1PPSNBOˏT%BUBOBMZTJT'MPX 1SPDFTT $PMMFDU 1BSTF $MFBOVQ 4UPSF 1SPDFTT 7JTVBMJ[F
  • 24. $PMMFDU 1SPDFTT $PMMFDU 1BSTF $MFBOVQ 4UPSF 1SPDFTT 7JTVBMJ[F
  • 26. VFOUET QMVHJO׾ⵃ欽׃גؚٗ׾굲לׅ Ø 굲לؚׅٗכِ٦ؠך،ؙءّٝ׾2VFSZ4USJOH חろ׭ג굲לׅ • 醱꧟ז+40/כ굲לׁ׆ծ2VFSZ4USJOHח䞔㜠鯹ׇ׷ • )JWFדך꧊鎘儗חⰋג+40/ח㢌䳔 • IUUQTFYBNQMFDPNCFBDPO TVCPCKDPVQPOBDUJPODMJDLDJE Ø 'MVFOUE꧊秈؟٦غכⵃ欽׃זְ • ٔ،ٕة؎ي꧊鎘ך䗳銲䚍כ植朐넝ֻזְ • ⱔꞿ圓䧭׮罋ִילז׵׆醱꧟חז׷ • 4ך㸜㹀䠬חֶ⟣ׇ׃׋ְ
  • 27. 1SPDFTT $PMMFDU 1BSTF $MFBOVQ 4UPSF 1SPDFTT 7JTVBMJ[F 4UPSF
  • 28. 4UPSF Ø ה׶ִ֮׆4ח굲לׅ Ø 4ךغ؛حزכ劤殢嗚鏾דⴓֽגֶֻ • غ؛حز⽃⡘ד،ؙإأ؝ٝزٗ٦ٕ〳腉 • FYBNQMFDPNQSPEVDUJPOMPH Ø ؟٦غ䕵ⶴⴽחؗ٦׾ⴓֽגֶֻ • ⴽ؟٦غָ㟓ִג׮㸜䗰 • FYBNQMFDPNQSPEVDUJPOMPHBQJ Ø 傈ⴽחؗ٦׾ⴓֽגֶֻ • )JWFךػ٦ذ؍ءّٝ׾ⵃ欽ׅ׷捀 • FYBNQMFDPNQSPEVDUJPOMPHBQJEU
  • 29. 1SPDFTT $PMMFDU 1BSTF $MFBOVQ 4UPSF 1SPDFTT 7JTVBMJ[F 1SPDFTT
  • 30. 1SPDFTT Ø ⥋걾ה㹋籐ך㢸꟦غحث • 盖椚؟٦غַ׵)BEPPQ
  • 31. )JWFך.3׾饯⹛ • 'MVFOUE꧊秈؟٦غ׾ⵃ欽׃גְזְ捀稢ⴖ׸הז׏׋ؚٗؿ؋؎ ٕ׾㖇簭ծ穠さ )BEPPQכ稢ⴖ׸㼭ְׁؿ؋؎ٕךⳢ椚蕱䩛 • ؚٗח鎸ꐮׁ׸גְ׷2VFSZ4USJOH׾6%'׾ⵃ欽׃ג+40/ח㢌䳔 • 鋅׷ץֹ鯥ד꧊鎘׃ג⥂㶷 • ♳鎸Ⰻגך1SPDFTT׾)%'4חر٦ة׾衅הׁ׆4׾ⵃ欽׃ג㹋遤 • 剑穄涸ז꧊鎘⦼׾3%4ח呓秛 Ø 厫鮾ד鸞ְ儎꟦ؙؒٔ • 盖椚؟٦غַ׵)BEPPQ
  • 32. )JWF
  • 34. 1SPDFTT $PMMFDU 1BSTF $MFBOVQ 4UPSF 1SPDFTT 7JTVBMJ[F 7JTVBMJ[F
  • 35. 7JTVBMJ[F Ø .3ד꧊鎘׃׋ر٦ة׾.Z42-חٗ٦س Ø 盖椚؟٦غ♳ד⹛ֻ؟٦ؽأ׾ⵃ欽׃ג⦼׾〳鋔⻉ • ًٝغ٦Ⰻ㆞ָずׄ⦼׾鋅ג侧⦼然钠 Ø ⡭׏ג׷爡ⰻ؟٦غח鑐꿀涸חMBTUJDTFBSDI
  • 36. ,JCBOB׾ 㼪Ⰵ • ر٦ة׾䒚׶זָ׵ⴓ匿鯥׾罋ִ׋ְ儗ח⤑ⵃ
  • 37. 1PPSNBOˏT%BUBOBMZTJT'MPX 1SPDFTT $PMMFDU 1BSTF $MFBOVQ 4UPSF 1SPDFTT 7JTVBMJ[F
  • 38. :(/*
  • 41. 1PPSNBOˏT%BUBOBMZTJT'MPX 1SPDFTT $PMMFDU 1BSTF $MFBOVQ 4UPSF 1SPDFTT 7JTVBMJ[F
  • 42. 3FGFSFODFT Ø 84NB[PO.3#FTU1SBDUJDFT • ؝ٖ׾铣׭ל荈ⴓ麦ך؝ٝذؙأزחさ׏׋.3圓䧭ָ׻ַ׷կ )BEPPQךⰅꟌה׃ג׮葺ְךדכկ Ø NJYJך鍑匿㛇湍הQBDIF)JWFדך+40/ػ٦؟ ך崞欽ך稱➜ • +40/ד顕׭ג7JFXדذ٦ـٕ׏שֻ䪔ֲ،؎ر؍،׾顗׏׋կؚٗ ꧊鎘חꟼ׻׷➂麦ך؝ىُص؛٦ءّٝ؝أزծהְֲ嚊䙀׮顗׏׋կ Ø #BUDI1SPDFTTJOHBOE4USFBN1SPDFTTJOHCZ42- • ֿךز٦ؙ׾耀ְגⴓ匿㛇湍ח.11禸ؒٝآٝ׾ⵃ欽ׅ׷✲׾寸䠐կ *NQBMBה1SFTUP׾嫰鯰׃ծ4ח׮湫䱸ؙؒٔ׾䫎־׸׷1SFTUP׾㼪 Ⰵ׃׋կ *NQBMB׮如劍غ٦آّٝדכ4ח湫䱸ؙؒٔ䫎־׸׷׵׃ ְךד׉ך儗חⱄ䏝嗚鏾✮㹀
  • 45. VTF UPEPUIFGPMMPXJOH BXTDMJ YFDVUF )JWF2- YFDVUF TEJTUDQ $POH :PVS.3 #PPUTUSQ 1SFTUP $SFBUF $MVTUFS .FUBTUS $POH 1ZUIPO 4DSJQU $SFBUF $MVTUFS +PC'MPX .HNOU GSPN YFDVUF )JWF2- .3
  • 46. VTF UPEPUIFGPMMPXJOH BXTDMJ YFDVUF )JWF2- YFDVUF TEJTUDQ $POH :PVS.3 #PPUTUSQ 1SFTUP $SFBUF $MVTUFS .FUBTUS $POH 1ZUIPO 4DSJQU $SFBUF $MVTUFS +PC'MPX .HNOU GSPN YFDVUF )JWF2- .3
  • 47. BXTDMJ Ø ٔٔ٦أך7FSַ׵.3堣腉ך1SFWJFX أذ٦ةأָ《׸ծ兦׸ג㸜㹀׃׋1*ה׃גⵃ欽〳腉 Ø ➙תדرؿ؋ؙز׌׏׋3VCZךMBTUJD.BQ3FEVDFأؙ ٔفزַ׵⛦׶䳔ִ • QJQד知⽃ח؎ٝأز٦ٕדֹ׷ • ⟃⵸ַ׵BXTDMJ׾⢪׏ג׷ךדخ٦ٕ窟♧ • (JU)VC♳דךꟚ涪ָ崞涪ד13׮⳿ׇ׷
  • 49. $ mkvirtualenv pycon-­‐emr-­‐dev (pycon-­‐emr-­‐dev)$ pip install awscli (pycon-­‐emr-­‐dev)$ mkdir ~/.awscli (pycon-­‐emr-­‐dev)$ cat -­‐EOF ~/.awscli/config [profile development] aws_access_key_id=development_access_key aws_secret_access_key=development_secret_key region=ap-­‐northeast-­‐1 EOF (pycon-­‐emr-­‐dev)$ cat -­‐EOF $VIRTUAL_ENV/bin/activate export AWS_CONFIG_FILE=~/.awscli/config export AWS_DEFAULT_PROFILE=development source aws_zsh_completer.sh EOF
  • 50. VTF UPEPUIFGPMMPXJOH BXTDMJ YFDVUF )JWF2- YFDVUF TEJTUDQ $POH :PVS.3 #PPUTUSQ 1SFTUP $SFBUF $MVTUFS .FUBTUS $POH 1ZUIPO 4DSJQU $SFBUF $MVTUFS +PC'MPX .HNOU GSPN YFDVUF )JWF2- .3
  • 51. $ aws emr create-­‐cluster -­‐-­‐ami-­‐version 3.1.1 -­‐-­‐name 'PyConJP 2014 (AMI 3.1.1 Hive)' -­‐-­‐tags Name=pycon-­‐jp-­‐emr environment=development -­‐-­‐ec2-­‐attributes KeyName=yourkey -­‐-­‐log-­‐uri 's3://yourbucket/jobflow_logs/' -­‐-­‐no-­‐auto-­‐terminate -­‐-­‐visible-­‐to-­‐all-­‐users -­‐-­‐instance-­‐groups file://./normal-­‐instance-­‐setup.json -­‐-­‐applications file://./app-­‐hive.json
  • 52. [ { OPSNBMJOTUBODFHSPVQKTPO BQQIJWFKTPO Name: emr-­‐master, InstanceGroupType: MASTER, InstanceCount: 1, InstanceType: m1.medium }, { Name: emr-­‐core, InstanceGroupType: CORE, InstanceCount: 2, InstanceType: m1.medium } ] [ { Name: HIVE } ]
  • 53. SFTVMU { ClusterId: j-­‐8xxxxxxxxx }
  • 54. VTF UPEPUIFGPMMPXJOH BXTDMJ YFDVUF )JWF2- YFDVUF TEJTUDQ $POH :PVS.3 #PPUTUSQ 1SFTUP $SFBUF $MVTUFS .FUBTUS $POH 1ZUIPO 4DSJQU $SFBUF $MVTUFS +PC'MPX .HNOU GSPN YFDVUF )JWF2- .3
  • 55. $ aws emr add-­‐steps -­‐-­‐cluster-­‐id j-­‐8xxxxxxxxx -­‐-­‐steps file://./hive-­‐sample-­‐step-­‐1.json
  • 56. [ { IJWFTBNQMFTUFQKTPO Args: [ -­‐f, s3n://yourbucket/hive-­‐script/sample01.hql, -­‐d, BUCKET_NAME=yourbucket, -­‐d, TARGET_DATE=20140818 ], ActionOnFailure: CONTINUE, Name: Hive Sample Program 01, Type: HIVE }, { Args: [ -­‐f, s3n://yourbucket/hive-­‐script/sample02.hql, -­‐d, BUCKET_NAME=yourbucket, -­‐d, TARGET_DATE=20140818 ], ActionOnFailure: CONTINUE, Name: Hive Sample Program 02, Type: HIVE } ]
  • 58. $ aws emr add-steps --cluster-id j-8xxxxxxxxx --steps file://./s3distcp-sample-step.json
  • 59. s3distcp-sample-step.json:
      [
        {
          "Name": "s3distcp Sample",
          "ActionOnFailure": "CONTINUE",
          "Jar": "/home/hadoop/lib/emr-s3distcp-1.0.jar",
          "Type": "CUSTOM_JAR",
          "Args": [
            "--src", "s3n://yourbucket/access_log/dt=20140818",
            "--dest", "s3n://yourbucket/compressed_log/dt=20140818",
            "--groupBy", ".*(nginx_access_log-).*",
            "--targetSize", "100",
            "--outputCodec", "gzip"
          ]
        }
      ]
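The `--groupBy` pattern is the least obvious argument here: as far as I understand it, files whose captured group is identical are concatenated into one output. The sketch below only illustrates that grouping idea with Python's `re` module. It is not how s3distcp is implemented, and the object keys are made up for the example.

```python
import re
from collections import defaultdict

# Same pattern as the --groupBy argument on the slide.
GROUP_BY = re.compile(r".*(nginx_access_log-).*")


def group_files(paths, pattern=GROUP_BY):
    """Group paths by the text captured in the pattern's first group."""
    groups = defaultdict(list)
    for path in paths:
        match = pattern.fullmatch(path)
        if match is None:
            continue  # in this sketch, non-matching paths join no group
        groups[match.group(1)].append(path)
    return dict(groups)


# Hypothetical object keys, for illustration only.
paths = [
    "access_log/dt=20140818/nginx_access_log-web01.log",
    "access_log/dt=20140818/nginx_access_log-web02.log",
    "access_log/dt=20140818/error_log-web01.log",
]
grouped = group_files(paths)
print(grouped)
```

Because the captured text is the constant `nginx_access_log-`, every access log for the day lands in a single group, which matches the intent of merging many small log files into larger compressed ones.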
  • 61.
      $ aws emr create-cluster \
          --ami-version 3.1.1 \
          --name 'PyConJP 2014 (AMI 3.1.1 Hive)' \
          --tags Name=pycon-jp-emr environment=development \
          --ec2-attributes KeyName=yourkey \
          --log-uri 's3://yourbucket/jobflow_logs/' \
          --no-auto-terminate \
          --visible-to-all-users \
          --instance-groups file://./normal-instance-setup.json \
          --applications file://./app-hive-with-config.json
  • 62. app-hive-with-config.json:
      [
        {
          "Args": [ "--hive-site=s3://yourbucket/libs/config/hive-site.xml" ],
          "Name": "HIVE"
        }
      ]
  • 63. hive-site.xml:
      <?xml version="1.0"?>
      <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
      <configuration>
        <property>
          <name>hive.optimize.s3.query</name>
          <value>true</value>
          <description>Optimize query on S3</description>
        </property>
      </configuration>
  • 65.
      $ aws emr create-cluster \
          --ami-version 3.1.1 \
          --name 'PyConJP 2014 (AMI 3.1.1 Hive + Presto)' \
          --tags Name=pycon-jp-emr environment=development \
          --ec2-attributes KeyName=yourkey \
          --log-uri 's3://yourbucket/jobflow_logs/' \
          --no-auto-terminate \
          --visible-to-all-users \
          --instance-groups file://./normal-instance-setup.json \
          --bootstrap-actions file://./bootstrap-presto.json \
          --applications file://./app-hive-with-config.json
  • 66.
      [
        {
          "Name": "Install/Setup Presto",
          "Path": "s3://yourbucket/libs/setup-presto.rb",
          "Args": [
            "--task_memory", "1GB",
            "--log-level", "DEBUG",
            "--version", "0.75",
            "--presto-repo-url", "http://central.maven.org/maven2/com/facebook/presto/",
            "--sink-buffer-size", "1GB",
            "--query-max-age", "1h",
            "--jvm-config", "-server -Xmx2G -XX:+UseConcMarkSweepGC -XX:+ExplicitGCInvokesConcurrent -XX:+CMSClassUnloadingEnabled -XX:+AggressiveOpts -XX:+HeapDumpOnOutOfMemoryError -XX:OnOutOfMemoryError=kill -9 %p -XX:PermSize=150M -XX:MaxPermSize=150M -XX:ReservedCodeCacheSize=150M -Dhive.config.resources=/home/hadoop/conf/core-site.xml,/home/hadoop/conf/hdfs-site.xml"
          ]
        }
      ]
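  • 67. Ø setup-presto.rb is actually https://github.com/awslabs/emr-bootstrap-actions/blob/master/presto/install Ø It is a bootstrap script, released by AWS on an experimental basis, for installing Presto on EMR Ø It worked on some AMI versions but not on another; the Hive version differs between them [version numbers not recoverable from the source] Ø The Thrift service port appears to be different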
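  • 69. Ø The Metastore is where Hive keeps information such as table definitions Ø Today MySQL is used for it in most cases Ø If nothing is configured, it is stored in the MySQL running on the EMR instance Ø Pointing the Metastore at a DB outside EMR means the DDL no longer has to be re-run every time an EMR cluster is started Ø The Security Group on the DB side has to be modified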
  • 70. hive-site.xml, as referenced from app-hive-with-config.json:
      <configuration>
        <property>
          <name>hive.optimize.s3.query</name>
          <value>true</value>
          <description>Optimize query on S3</description>
        </property>
        <property>
          <name>javax.jdo.option.ConnectionURL</name>
          <value>jdbc:mysql://hostname:3306/hive?createDatabaseIfNotExist=true</value>
          <description>JDBC connect string for a JDBC metastore</description>
        </property>
        <property>
          <name>javax.jdo.option.ConnectionDriverName</name>
          <value>com.mysql.jdbc.Driver</value>
          <description>Driver class name for a JDBC metastore</description>
        </property>
        <property>
          <name>javax.jdo.option.ConnectionUserName</name>
          <value>username</value>
          <description>Username to use against metastore database</description>
        </property>
        <property>
          <name>javax.jdo.option.ConnectionPassword</name>
          <value>password</value>
          <description>Password to use against metastore database</description>
        </property>
      </configuration>
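Since every property has the same name/value/description shape, the file can be rendered from a short list instead of edited by hand, for example to swap the metastore host per environment. This is a sketch using only the standard library; `build_hive_site` is a hypothetical helper and the hostname is the placeholder from the slide.

```python
import xml.etree.ElementTree as ET


def build_hive_site(properties):
    """Render (name, value, description) tuples as a hive-site.xml string."""
    root = ET.Element("configuration")
    for name, value, description in properties:
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
        ET.SubElement(prop, "description").text = description
    return ET.tostring(root, encoding="unicode")


# Placeholder values taken from the slide.
xml_text = build_hive_site([
    ("javax.jdo.option.ConnectionURL",
     "jdbc:mysql://hostname:3306/hive?createDatabaseIfNotExist=true",
     "JDBC connect string for a JDBC metastore"),
    ("javax.jdo.option.ConnectionDriverName",
     "com.mysql.jdbc.Driver",
     "Driver class name for a JDBC metastore"),
])
print(xml_text)
```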
  • 72. Ø Sometimes you want to launch EMR from inside a Python batch job Ø Or launch it as a Celery task Ø In those cases EMR can also be driven from within Python Ø Use boto.emr Ø Borrowing the handy utilities inside awscli may also be an option
  • 74.
      # -*- coding: utf-8 -*-
      from datetime import datetime

      from boto.emr import connect_to_region
      from boto.emr.step import InstallHiveStep


      def setup_emr():
          # AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY must be exported
          # as environment variables.
          conn = connect_to_region('ap-northeast-1')
          install_step = InstallHiveStep(hive_versions='0.11.0.2')
          # Start a long-running cluster (keep_alive=True) with Hive installed.
          jobid = conn.run_jobflow(
              name='Create EMR [{}]'.format(datetime.today().strftime('%Y%m%d')),
              log_uri='s3://yourbucket/jobflow_logs/',
              ec2_keyname='your_key',
              master_instance_type='m1.medium',
              slave_instance_type='m1.medium',
              num_instances=3,
              action_on_failure='TERMINATE_JOB_FLOW',
              keep_alive=True,
              enable_debugging=False,
              hadoop_version='2.4.0',
              steps=[install_step],
              bootstrap_actions=[],
              instance_groups=None,
              additional_info=None,
              ami_version='3.1.1',
              api_params=None,
              visible_to_all_users=True,
              job_flow_role=None)
          return jobid


      if __name__ == '__main__':
          jobflow_id = setup_emr()
          print("JobFlowID: {} started.".format(jobflow_id))
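  • 75. Ø Never put AWS credentials in the source code • Putting them in environment variables is also best avoided • It may be unavoidable when testing on a local machine • Control access through the IAM Role attached to the EC2 instance that launches EMR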
  • 77. (This got long, so here is just the gist.)
      from boto.emr.step import HiveStep

      # conn, install_step, bucket_name, target_date and hive_version
      # are defined elsewhere in the script.
      jobid = conn.run_jobflow(
          name='Create EMR and Exec hiveql [{}]'.format(target_date),
          log_uri='s3://{}/jobflow_logs/'.format(bucket_name),
          ec2_keyname='your_key',
          master_instance_type='m1.medium',
          slave_instance_type='m1.medium',
          num_instances=3,
          action_on_failure='TERMINATE_JOB_FLOW',
          keep_alive=True,
          enable_debugging=False,
          hadoop_version='2.4.0',
          steps=[install_step],
          bootstrap_actions=[],
          instance_groups=None,
          additional_info=None,
          ami_version='3.1.1',
          api_params=None,
          visible_to_all_users=True,
          job_flow_role=None)

      # Add one Hive step per query file to the running job flow.
      query_files = ['sample01.hql', 'sample02.hql']
      hql_steps = []
      for query_file in query_files:
          hql_step = HiveStep(
              name='Executing Query [{}]'.format(query_file),
              hive_file='s3n://{0}/hive-script/{1}'.format(
                  bucket_name, query_file),
              hive_versions=hive_version,
              hive_args=['-dTARGET_DATE={0}'.format(target_date),
                         '-dBUCKET_NAME={0}'.format(bucket_name)])
          hql_steps.append(hql_step)
      conn.add_jobflow_steps(jobid, hql_steps)
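  • 79. Ø We want dependencies between batch jobs • e.g. when A finishes, run B and C at the same time • e.g. when A and B have both finished, run C Ø We want an easier way to manage start times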
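The two dependency patterns above can be expressed as a small graph and resolved into "waves" of jobs that may run together. The sketch below uses the standard-library `graphlib` module (Python 3.9 or later); it illustrates the idea only and is not how any particular scheduler implements it. `execution_waves` and the job names are hypothetical.

```python
from graphlib import TopologicalSorter


def execution_waves(dependencies):
    """Return batches of jobs; jobs in the same batch can run concurrently."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        # Every job whose dependencies are all finished is ready now.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Each key maps a job to the set of jobs it depends on.
fan_out = {"B": {"A"}, "C": {"A"}}   # when A finishes, run B and C together
fan_in = {"C": {"A", "B"}}           # when A and B have finished, run C

print(execution_waves(fan_out))
print(execution_waves(fan_in))
```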
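  • 80. • https://github.com/spotify/luigi • A pipeline management framework written in Python • Includes a mechanism for easily writing MapReduce jobs on top of Hadoop Streaming • Dependencies are resolved in Python code alone • Dependency visualization (started as a separate service) • The dependency visualization tool has no finer-grained features such as authentication • Supports running HiveQL • Supports running Pig • Supports S3 operations • Overkill for our current situation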
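  • 81. • The admin screen uses Django • celery and celerybeat run on the same server • django-celery is used to configure specific tasks to be put on the queue at specific times • celerybeat picks up the queued tasks and runs them • celery and Django can be integrated without django-celery, but this scheduling feature is convenient, so we still use it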
  • 82. References Ø https://github.com/aws/aws-cli • Official documentation and source Ø https://github.com/boto/boto • Official documentation and source
  • 87. Ø As of last Friday, this talk was a Markdown file of about [number lost] lines Ø I tried to go Slideless Ø Held a review session inside the company
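  • 89. Ø My first technical talk Ø A good opportunity to summarize what I have been doing at work Ø I want to know what other people think about while they work Ø I want to know why other companies chose the architectures they did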
  • 90. Goal: I want to hear everyone's stories (that is, I do) Ø Share a case study focused on "why this architecture?" Ø Aimed beyond the introductory level: Python