Principles of a Very Simple Search Engine
DaeMyung Kang (강대명) (CHARSYAM@NAVER.COM)
Confession!!!
• My only hands-on experience with search engines is a part-time job in my university lab, which happened to build search engines.
• Beyond that, I have only studied the topic a little on my own.
• In other words, a fair amount of "bluff" may be mixed in.
Topic
• Understanding how a search engine works
Why?
• When do you need a search engine?
  • When you need to retrieve information ordered by ranking.
  • When you want the documents that contain both "파이썬" (Python) and "KOREA".
If you need a search engine?
• Use Elasticsearch.
• Use Solr.
Using the ones above is far better than building your own for no good reason.
Then why this talk?
• Just for fun...
• After seeing an interesting deck, I wanted to write up my own summary...
Jongmin Kim's 데놀 talk
• https://www.slideshare.net/kjmorc/ss-80803233
  • Reading that deck alone is enough.
• Mine is the "bluff" version.
Components of a search engine (academic view)
• Indexing process
• Query process
Indexing process (academic view)
[Diagram: Data Source (Web, File, E-Mail), Acquisition, Data Store, Transformation, Index Creation, Index]
Query process (academic view)
[Diagram: User, User Interaction, Ranking, Evaluation, Index, Data Store, Log Data]
Components of a search engine
• Indexing process
  • Crawling + building the inverted index
• Query process
  • Index lookup + ranking
Crawling #0
• Collecting web pages
• Requests module
  • r = requests.get('http://www.naver.com')
Crawling #1
• Questions to ask!!! (non-technical)
  • Are we allowed to crawl?
    • robots.txt
  • "Google's search bot is attacking our server."
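One way to ask "are we allowed to crawl?" in code is Python's standard `urllib.robotparser`. The following is a minimal sketch, not a full politeness policy: the robots.txt rules, the example.com URLs and the crawler name `my-crawler` are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. A real crawler would download it from
# http://<host>/robots.txt before fetching anything else from that host.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt):
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser, url, agent="my-crawler"):
    """Return True when the rules let this user agent fetch the URL."""
    return parser.can_fetch(agent, url)


parser = build_parser(ROBOTS_TXT)
print(is_allowed(parser, "http://example.com/index.html"))
print(is_allowed(parser, "http://example.com/private/data.html"))
```

Checking the rules before every fetch, and slowing down per host, is also what keeps a crawler from looking like the "attack" mentioned on the slide.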
Crawling #2
• Simple Idea
[Diagram: a Crawling Loop takes URLs from a Redis List with BLPOP]
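The loop in this diagram can be sketched without a Redis server by letting Python's `queue.Queue` stand in for the Redis list: its blocking `get` plays the role of `BLPOP`. This is an illustration under that assumption only; `crawl`, `FAKE_WEB` and the example.com URLs are hypothetical names, and the fake fetch function replaces a real HTTP request.

```python
import queue


def crawl(frontier, fetch, idle_timeout=0.1):
    """Pop URLs until the frontier stays empty for idle_timeout seconds."""
    pages = {}
    while True:
        try:
            # Blocking pop: the in-memory counterpart of BLPOP on a Redis list.
            url = frontier.get(timeout=idle_timeout)
        except queue.Empty:
            break  # nothing arrived in time, so this sketch simply stops
        pages[url] = fetch(url)
    return pages


# Hypothetical stand-in for a real HTTP fetch such as requests.get(url).text.
FAKE_WEB = {
    "http://example.com/a": "<html>page a</html>",
    "http://example.com/b": "<html>page b</html>",
}

frontier = queue.Queue()
for url in FAKE_WEB:
    frontier.put(url)  # with Redis, a producer would RPUSH onto the list

pages = crawl(frontier, FAKE_WEB.__getitem__)
print(sorted(pages))
```

With a shared Redis list, several crawler processes could run the same loop against one queue, which is the appeal of the simple idea.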
Crawling #3
• Once a page has been fetched?
  • Extract links
  • Extract meaningful params
  • Convert the encoding
  • Extract text (strip tags)
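Two of these steps, link extraction and text extraction, can be sketched with the standard library alone. The class name `PageParser`, the helper `parse_page` and the sample HTML are assumptions made for this example; encoding conversion and parameter filtering are left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageParser(HTMLParser):
    """Collect absolute links and visible text from one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self.text_parts = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, href))

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not script or style content.
        if self._skip_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


def parse_page(base_url, html):
    """Return (links, text) for one page."""
    parser = PageParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.links, " ".join(parser.text_parts)


SAMPLE = (
    "<html><body><h1>Blue sky</h1><script>var x = 1;</script>"
    '<a href="/about">About us</a></body></html>'
)
links, text = parse_page("http://example.com/index.html", SAMPLE)
print(links)
print(text)
```

The extracted links go back into the queue of URLs to visit, and the extracted text is what the indexing steps work on.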
Crawling #4
• Should the same page be revisited?
  • If it should not be revisited?
    • How do we record that?
  • If it should be revisited? Every how many days?
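One possible answer to "how do we record that?" is a per-URL timestamp of the last visit plus a revisit interval. The sketch below keeps the record in a plain dictionary; the class name `VisitLog` and the three-day interval are assumptions chosen for illustration, and a real crawler would persist this in its storage.

```python
import time


class VisitLog:
    """Remember when each URL was last fetched and decide if it is due."""

    def __init__(self, revisit_after_seconds):
        self.revisit_after = revisit_after_seconds
        self.last_visit = {}  # url -> timestamp of the last fetch

    def should_visit(self, url, now=None):
        now = time.time() if now is None else now
        last = self.last_visit.get(url)
        # Never seen, or seen long enough ago to be worth fetching again.
        return last is None or now - last >= self.revisit_after

    def record(self, url, now=None):
        self.last_visit[url] = time.time() if now is None else now


DAY = 24 * 60 * 60
log = VisitLog(revisit_after_seconds=3 * DAY)  # hypothetical 3-day interval

url = "http://example.com/news"
print(log.should_visit(url, now=0))        # never visited
log.record(url, now=0)
print(log.should_visit(url, now=1 * DAY))  # visited one day ago
print(log.should_visit(url, now=3 * DAY))  # interval has elapsed
```

Setting the interval to infinity gives the "never revisit" policy; a per-site interval would be a natural refinement.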
Crawling #5
• Do we need to store the pages?
  • Where should they be stored?
    • A distributed file system?
      • This is what Google built BigTable for. (HBase, Cassandra, or other column-oriented storage)
    • A DB?
  • What data should be stored?
    • The original? A transformed version?
Crawling #6
• Simple Idea
[Diagram: Queue (URLs to visit), Crawling Loop, Storage (visit time/interval, crawled data)]
Indexing #0
• Suppose we index the following documents.
The bright blue butterfly hangs on the breeze
It's best to forget the great sky and to retire from every wind
Under blue sky, in bright sunlight, one need not search around
Indexing #1 - Tokenizing
• First, split the text into words.
DOC1: The | bright | blue | butterfly | hangs | on | the | breeze
DOC2: It's | best | to | forget | the | great | sky | and | to | retire | from | every | wind.
DOC3: Under | blue | sky | in | bright | sunlight | one | need | not | search | around.
Indexing #2 - Transformation
• Remove special characters
DOC1: The | bright | blue | butterfly | hangs | on | the | breeze
DOC2: It's | best | to | forget | the | great | sky | and | to | retire | from | every | wind
DOC3: Under | blue | sky | in | bright | sunlight | one | need | not | search | around
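The first two indexing steps, splitting on whitespace and removing special characters, fit in a few lines. This is a sketch under simple assumptions: the function names are invented here, and the rule "keep letters, digits and apostrophes" is one possible reading of "special characters" that leaves "It's" intact as on the slide.

```python
import re

DOCS = {
    1: "The bright blue butterfly hangs on the breeze",
    2: "It's best to forget the great sky and to retire from every wind",
    3: "Under blue sky, in bright sunlight, one need not search around",
}


def tokenize(text):
    """Split on whitespace only; punctuation stays attached to words."""
    return text.split()


def strip_special(token):
    """Drop everything except word characters and apostrophes."""
    return re.sub(r"[^\w']", "", token)


def clean_tokens(text):
    cleaned = (strip_special(token) for token in tokenize(text))
    return [token for token in cleaned if token]  # skip tokens that became empty


for doc_id, text in DOCS.items():
    print(doc_id, clean_tokens(text))
```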
Indexing #3 - Inverted index
• Convert document -> word into word -> document
Word       Doc   Word    Doc   Word    Doc   Word      Doc
The        1     It's    2     to      2     in        3
bright     1     best    2     retire  2     bright    3
blue       1     to      2     from    2     sunlight  3
butterfly  1     forget  2     every   2     one       3
hangs      1     the     2     wind    2     need      3
on         1     great   2     Under   3     not       3
the        1     sky     2     blue    3     search    3
breeze     1     and     2     sky     3     around    3
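Turning "document -> words" into "word -> document" is a single pass that emits one (word, document) pair per token, which is exactly the table above. A minimal sketch, with `invert` as a name chosen for this example and only the first document spelled out:

```python
def invert(doc_tokens):
    """Flatten {doc_id: [tokens]} into a list of (token, doc_id) pairs."""
    pairs = []
    for doc_id, tokens in doc_tokens.items():
        for token in tokens:
            pairs.append((token, doc_id))
    return pairs


DOC_TOKENS = {
    1: ["The", "bright", "blue", "butterfly", "hangs", "on", "the", "breeze"],
}

for word, doc_id in invert(DOC_TOKENS):
    print(word, doc_id)
```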
Indexing #4 - Merging identical words
• What do we need to do to merge identical words?
Word       Doc   Word    Doc   Word    Doc   Word      Doc
The        1     It's    2     to      2     in        3
bright     1     best    2     retire  2     bright    3
blue       1     to      2     from    2     sunlight  3
butterfly  1     forget  2     every   2     one       3
hangs      1     the     2     wind    2     need      3
on         1     great   2     Under   3     not       3
the        1     sky     2     blue    3     search    3
breeze     1     and     2     sky     3     around    3
Indexing #4-1 - Case conversion
• Convert uppercase to lowercase.
Word       Doc   Word    Doc   Word    Doc   Word      Doc
the        1     it's    2     to      2     in        3
bright     1     best    2     retire  2     bright    3
blue       1     to      2     from    2     sunlight  3
butterfly  1     forget  2     every   2     one       3
hangs      1     the     2     wind    2     need      3
on         1     great   2     under   3     not       3
the        1     sky     2     blue    3     search    3
breeze     1     and     2     sky     3     around    3
Indexing #4-2 - Sorting
• The entries are also sorted to prepare for the next step.
Word    Doc   Word       Doc   Word    Doc   Word      Doc
and     2     butterfly  1     need    3     sunlight  3
around  3     every      2     not     3     the       1
best    2     forget     2     on      1     the       1
blue    1     from       2     one     3     the       2
blue    3     great      2     retire  2     to        2
breeze  1     hangs      1     search  3     to        2
bright  1     in         3     sky     2     under     3
bright  3     it's       2     sky     3     wind      2
Indexing #4-3 - Stop word removal
• Remove the words that are too common to be useful.
  • They have no value as search terms.
a        not
and      on
around   one
every    the
for      to
from     under
in       ...
it       ...
it's     ...
Indexing #4-4 - Stop word removal
• Delete the unused words ("-" marks a removed entry)
Word    Doc   Word       Doc   Word    Doc   Word      Doc
-             butterfly  1     need    3     sunlight  3
-             -                -             -
best    2     forget     2     -             -
blue    1     -                -             -
blue    3     great      2     retire  2     -
breeze  1     hangs      1     search  3     -
bright  1     -                sky     2     -
bright  3     -                sky     3     wind      2
Indexing #4-5 - Stop word removal
Word       Doc   Word      Doc   Word  Doc
best       2     great     2     wind  2
blue       1     hangs     1
blue       3     need      3
breeze     1     retire    2
bright     1     search    3
bright     3     sky       2
butterfly  1     sky       3
forget     2     sunlight  3
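Steps #4-1 to #4-5 (lowercase, sort, drop stop words) can be chained in one small function. The stop word set below contains only the words visible on the slide, whose list trails off with "..."; the function and variable names are assumptions made for this sketch.

```python
DOCS = {
    1: "The bright blue butterfly hangs on the breeze",
    2: "It's best to forget the great sky and to retire from every wind",
    3: "Under blue sky, in bright sunlight, one need not search around",
}

# Only the stop words listed on the slide; the slide's list continues.
STOP_WORDS = {
    "a", "and", "around", "every", "for", "from", "in", "it", "it's",
    "not", "on", "one", "the", "to", "under",
}


def make_pairs(docs):
    """(word, doc_id) pairs after whitespace tokenizing and comma removal."""
    return [
        (word, doc_id)
        for doc_id, text in docs.items()
        for word in text.replace(",", "").split()
    ]


def normalize(pairs):
    """Lowercase, sort by (word, doc_id), then drop stop words."""
    lowered = sorted((word.lower(), doc_id) for word, doc_id in pairs)
    return [(word, doc_id) for word, doc_id in lowered if word not in STOP_WORDS]


for word, doc_id in normalize(make_pairs(DOCS)):
    print(word, doc_id)
```

Sorting is what places the two `blue` and the two `bright` entries next to each other, which makes the later merge a matter of comparing neighbours.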
Indexing #4-6 - Stemming
• Reduce verbs to their base form: split each word into stem and ending and keep only the stem (strip -s, -es, -ed, -ing, and so on)
Word       Doc   Word      Doc   Word  Doc
best       2     great     2     wind  2
blue       1     hang      1
blue       3     need      3
breeze     1     retire    2
bright     1     search    3
bright     3     sky       2
butterfly  1     sky       3
forget     2     sunlight  3
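A deliberately naive suffix stripper shows both the idea and its pitfalls. The minimum stem length of 3 is an assumption added here so that "need" is not cut down to "ne" by the -ed rule; real stemmers such as the Porter stemmer use far more careful rules, and this sketch still gets words like "this" wrong.

```python
SUFFIXES = ("ing", "ed", "es", "s")  # tried in this order


def naive_stem(word, min_stem=3):
    """Strip the first matching suffix if enough of the word remains."""
    for suffix in SUFFIXES:
        stem = word[: -len(suffix)]
        if word.endswith(suffix) and len(stem) >= min_stem:
            return stem
    return word


WORDS = ["best", "blue", "breeze", "bright", "butterfly", "forget", "great",
         "hangs", "need", "retire", "search", "sky", "sunlight", "wind"]

for word in WORDS:
    print(word, "->", naive_stem(word))
```

On the slide's vocabulary only "hangs" changes, becoming "hang", which matches the table above.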
Indexing #4-7 - Merging
Word       Doc   Word      Doc   Word  Doc
best       2     great     2     wind  2
blue       1,3   hang      1
                 need      3
breeze     1     retire    2
bright     1,3   search    3
                 sky       2,3
butterfly  1
forget     2     sunlight  3
Indexing #4-8 - Merging
Word       Doc   Word      Doc
best       2     need      3
blue       1,3   retire    2
breeze     1     search    3
bright     1,3   sky       2,3
butterfly  1     sunlight  3
forget     2     wind      2
great      2
hang       1
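Merging turns the list of (word, document) pairs into one posting list per word. A minimal sketch with a dictionary of sets; `merge_postings` is a name invented for this example, and the input pairs are the stemmed entries from the previous slide.

```python
from collections import defaultdict

PAIRS = [
    ("best", 2), ("blue", 1), ("blue", 3), ("breeze", 1), ("bright", 1),
    ("bright", 3), ("butterfly", 1), ("forget", 2), ("great", 2), ("hang", 1),
    ("need", 3), ("retire", 2), ("search", 3), ("sky", 2), ("sky", 3),
    ("sunlight", 3), ("wind", 2),
]


def merge_postings(pairs):
    """Group document ids by word; each posting list is sorted and unique."""
    postings = defaultdict(set)
    for word, doc_id in pairs:
        postings[word].add(doc_id)
    return {word: sorted(doc_ids) for word, doc_ids in sorted(postings.items())}


INDEX = merge_postings(PAIRS)
for word, doc_ids in INDEX.items():
    print(word, doc_ids)
```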
Indexing #5
Indexing #6
• English is fairly easy, but what about Korean?
  • Store only the words produced by a morphological analyzer
  • Open-source Korean morphological analyzers (Python support, dictionary-based)
    • Eunjeon (은전한닢) project (mecab-based)
    • KoNLPy
    • Komoran
Indexing #7
• N-gram
  • Morphological analysis can leave some words unindexed, and it struggles when the text is not properly spaced.
  • The N-gram approach works without any knowledge of the language.
    • In exchange, the results can be poor.
Indexing #8
• N-gram
  • With 2-grams (bigrams), "서핑클럽" is indexed as three two-character terms: "서핑", "핑클", "클럽".
  • With 3-grams (trigrams), "서핑클럽" is indexed as two three-character terms: "서핑클", "핑클럽".
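Character n-grams are a sliding window over the string, so the slide's example can be reproduced directly. The function name `char_ngrams` is an assumption for this sketch; it treats each Hangul syllable as one character, which holds for precomposed (NFC) text.

```python
def char_ngrams(text, n):
    """Return every run of n consecutive characters in text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


print(char_ngrams("서핑클럽", 2))
print(char_ngrams("서핑클럽", 3))
```

A string shorter than n yields no terms at all, which is one of the ways the approach can give poor results.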
Indexing #9 - Search
• Apply the same processing to the search terms
  • best is in document 2.
  • blue is in documents 1 and 3.
  • A search for blue & sky matches document 3.
  • Searching for a stop word returns no results.
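An AND query over the finished index is an intersection of posting lists. The sketch below hard-codes the final index from the merging slide and applies lowercasing and stop word removal to the query; stemming of query terms is omitted, and the names `INDEX`, `STOP_WORDS` and `search` are chosen for this example.

```python
INDEX = {
    "best": [2], "blue": [1, 3], "breeze": [1], "bright": [1, 3],
    "butterfly": [1], "forget": [2], "great": [2], "hang": [1],
    "need": [3], "retire": [2], "search": [3], "sky": [2, 3],
    "sunlight": [3], "wind": [2],
}

STOP_WORDS = {
    "a", "and", "around", "every", "for", "from", "in", "it", "it's",
    "not", "on", "one", "the", "to", "under",
}


def search(query):
    """Return ids of documents that contain every non-stop-word query term."""
    terms = [term.lower() for term in query.split()]
    terms = [term for term in terms if term not in STOP_WORDS]
    if not terms:
        return []  # the query consisted of stop words only
    result = None
    for term in terms:
        docs = set(INDEX.get(term, []))
        result = docs if result is None else result & docs
    return sorted(result)


print(search("best"))
print(search("blue"))
print(search("blue sky"))
print(search("the"))
```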
Indexing #10 - Questions
• How should terms whose inverted-index entries span a huge number of documents be handled?
  • What about words like Animal and Apple?
Ranking #0
• How can documents be ranked?
  • What makes a document a good document?
Ranking #1
• A good document
  • Is linked to by many other documents... (PageRank)
    • Especially if other good documents link to it?
  • Is updated frequently...
  • Uses the search term in an important way...
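The "linked to by other good documents" idea can be illustrated with a small power-iteration version of PageRank. This is a textbook-style sketch, not a description of how any production engine computes it: the three-page link graph, the damping factor of 0.85 and the iteration count are assumptions for the example.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Iteratively spread each page's rank over the pages it links to."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by pages without outgoing links is shared by everyone.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n
            for node in nodes
        }
        for source, targets in links.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical link graph: A and B both link to C, and C links back to A.
LINKS = {"A": ["C"], "B": ["C"], "C": ["A"]}

ranks = pagerank(LINKS)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this graph C ranks highest because two pages link to it, A comes next because the highly ranked C links to it, and B, with no inbound links, ranks last.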
Ranking #2 - Position information
• When searching for blue sky, which is the better match, document 1 or document 3?
Word       Doc:Position     Word      Doc:Position
best       2:10             need      3:300
blue       1:100,3:50       retire    2:100
breeze     1:30             search    3:500
bright     1:50,3:55        sky       2:20,3:55
butterfly  1:20             sunlight  3:400
forget     2:40             wind      2:10
great      2:60
hang       1:400
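With positions stored next to document ids, one simple signal is how close the query terms sit to each other inside a document. The sketch below uses only the `blue` and `sky` rows of the table; treating the smallest position gap as the signal, and the names `POSITIONS` and `min_distance`, are assumptions for illustration.

```python
# word -> {doc_id: [positions]}, copied from the blue and sky rows above.
POSITIONS = {
    "blue": {1: [100], 3: [50]},
    "sky": {2: [20], 3: [55]},
}


def min_distance(term_a, term_b, doc_id):
    """Smallest gap between the two terms in one document, or None."""
    pos_a = POSITIONS.get(term_a, {}).get(doc_id, [])
    pos_b = POSITIONS.get(term_b, {}).get(doc_id, [])
    if not pos_a or not pos_b:
        return None  # at least one of the terms is missing from the document
    return min(abs(a - b) for a in pos_a for b in pos_b)


print(min_distance("blue", "sky", 1))
print(min_distance("blue", "sky", 3))
```

Document 3 contains both terms only 5 positions apart, while document 1 has no `sky` at all, so document 3 is the better match for blue sky.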
Ranking #3
• How can we tell that a particular word is important in a document?
TF-IDF
Ranking #4
• What is TF-IDF?
  • The judgment that a word which appears often in a given document but rarely across the whole collection is likely to be a key term of that document.
  • TF: Term Frequency
    • How many times does the word appear in one document?
  • DF: Document Frequency
    • In how many documents of the whole collection does the word appear?
Ranking #5
• The word "카드뉴스" (card news) appears in 3 of the 10 documents.
• The word "오늘" (today) appears in 9 of the 10 documents.
  • IDF = log10(N / DF), so log10(10/9) ≈ 0.046: the more documents a word appears in, the smaller the value becomes (a property of TF-IDF).
Keyword    URL    TF   TF*IDF
카드뉴스   DOC1   5    5 * log10(10/3) = 5 * 0.52
           DOC2   3    3 * log10(10/3) = 3 * 0.52
           DOC3   10   10 * log10(10/3) = 10 * 0.52
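The numbers in this table follow from two one-line functions. The base-10 logarithm matches the slide's values (log(10/3) = 0.52); the function names are assumptions made for this sketch.

```python
import math


def idf(total_docs, docs_with_term):
    """Inverse document frequency with a base-10 logarithm."""
    return math.log10(total_docs / docs_with_term)


def tf_idf(tf, total_docs, docs_with_term):
    """Term frequency weighted by how rare the term is in the collection."""
    return tf * idf(total_docs, docs_with_term)


# "card news" appears in 3 of 10 documents, "today" in 9 of 10.
print(round(idf(10, 3), 2))   # about 0.52
print(round(idf(10, 9), 3))   # about 0.046

for doc, tf in [("DOC1", 5), ("DOC2", 3), ("DOC3", 10)]:
    print(doc, round(tf_idf(tf, 10, 3), 2))
```

Since the IDF factor is the same for all three documents, the ordering here is decided by TF alone, so DOC3 ranks first.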
Ranking #6 - Questions
• The score basically depends on TF and DF... so for the same DF, a higher TF gives a higher score... Then which of the following documents scores higher?
  • search engine
  • search engine search engine
Ranking #7 - BM25
• Because TF-IDF is affected by document length, BM25 is a refinement that takes the average document length into account.
  • Elasticsearch is said to use it.
Ranking #8 - BM25
• IDF = gives a smaller value the more documents the term appears in.
Ranking #9 - BM25
• TF = Term Frequency
• IDF = gives a smaller value the more documents the term appears in.
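The formula shown on these BM25 slides did not survive in the transcript. For reference, the commonly cited Okapi BM25 scoring function is given below; that the slides used exactly this form is an assumption, although its symbols k_1, b, |D| and avgdl match the ones defined under Ranking #10. Here f(q_i, D) is the term frequency of query term q_i in document D, N is the number of documents and n(q_i) is the number of documents containing q_i. The IDF shown is one common variant.

```latex
\mathrm{score}(D, Q) = \sum_{i=1}^{n} \mathrm{IDF}(q_i) \cdot
  \frac{f(q_i, D)\,(k_1 + 1)}
       {f(q_i, D) + k_1 \left(1 - b + b\,\frac{|D|}{\mathrm{avgdl}}\right)}
\qquad
\mathrm{IDF}(q_i) = \ln\!\left(\frac{N - n(q_i) + 0.5}{n(q_i) + 0.5} + 1\right)
```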
Ranking #10 - BM25
• k1, b = constants that are simply chosen
  • k1 = weight on tf, b = weight on the document (length normalization)
• |D| = length of the document
• avgdl = average document length
• In conclusion, a match in a document shorter than the average length scores higher.
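The effect of document length can be checked with a small implementation of the commonly cited BM25 term score. This is a sketch under assumptions: the IDF variant, the defaults k1 = 1.2 and b = 0.75, and the document counts and lengths below are all chosen for illustration rather than taken from the slides.

```python
import math


def bm25_idf(total_docs, docs_with_term):
    """One common BM25 IDF variant; smaller for terms found in more documents."""
    return math.log(
        (total_docs - docs_with_term + 0.5) / (docs_with_term + 0.5) + 1.0
    )


def bm25_term_score(tf, doc_len, avgdl, total_docs, docs_with_term,
                    k1=1.2, b=0.75):
    """Score of one query term in one document."""
    # Length normalization: below 1 for short documents, above 1 for long ones.
    norm = 1.0 - b + b * doc_len / avgdl
    idf = bm25_idf(total_docs, docs_with_term)
    return idf * tf * (k1 + 1.0) / (tf + k1 * norm)


# Same term frequency, same collection, different document lengths.
short_doc = bm25_term_score(tf=1, doc_len=2, avgdl=10, total_docs=10, docs_with_term=3)
long_doc = bm25_term_score(tf=1, doc_len=20, avgdl=10, total_docs=10, docs_with_term=3)
print(short_doc > long_doc)
```

Raising tf also raises the score, but with diminishing returns: the fraction approaches k1 + 1 as tf grows, which is what k1 controls.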
Conclusion
• Don't build your own search engine; just make good use of the existing ones.
Thank you.

More Related Content

What's hot

์นด์นด์˜คํ†ก์œผ๋กœ ์—ฌ์นœ ๋งŒ๋“ค๊ธฐ 2013.06.29
์นด์นด์˜คํ†ก์œผ๋กœ ์—ฌ์นœ ๋งŒ๋“ค๊ธฐ 2013.06.29์นด์นด์˜คํ†ก์œผ๋กœ ์—ฌ์นœ ๋งŒ๋“ค๊ธฐ 2013.06.29
์นด์นด์˜คํ†ก์œผ๋กœ ์—ฌ์นœ ๋งŒ๋“ค๊ธฐ 2013.06.29
Taehoon Kim
ย 
๋กœ๊ทธ ๊ธฐ๊น”๋‚˜๊ฒŒ ์ž˜ ๋””์ž์ธํ•˜๋Š” ๋ฒ•
๋กœ๊ทธ ๊ธฐ๊น”๋‚˜๊ฒŒ ์ž˜ ๋””์ž์ธํ•˜๋Š” ๋ฒ•๋กœ๊ทธ ๊ธฐ๊น”๋‚˜๊ฒŒ ์ž˜ ๋””์ž์ธํ•˜๋Š” ๋ฒ•
๋กœ๊ทธ ๊ธฐ๊น”๋‚˜๊ฒŒ ์ž˜ ๋””์ž์ธํ•˜๋Š” ๋ฒ•
Jeongsang Baek
ย 
BigQuery์˜ ๋ชจ๋“  ๊ฒƒ(๊ธฐํš์ž, ๋งˆ์ผ€ํ„ฐ, ์‹ ์ž… ๋ฐ์ดํ„ฐ ๋ถ„์„๊ฐ€๋ฅผ ์œ„ํ•œ) ์ž…๋ฌธํŽธ
BigQuery์˜ ๋ชจ๋“  ๊ฒƒ(๊ธฐํš์ž, ๋งˆ์ผ€ํ„ฐ, ์‹ ์ž… ๋ฐ์ดํ„ฐ ๋ถ„์„๊ฐ€๋ฅผ ์œ„ํ•œ) ์ž…๋ฌธํŽธBigQuery์˜ ๋ชจ๋“  ๊ฒƒ(๊ธฐํš์ž, ๋งˆ์ผ€ํ„ฐ, ์‹ ์ž… ๋ฐ์ดํ„ฐ ๋ถ„์„๊ฐ€๋ฅผ ์œ„ํ•œ) ์ž…๋ฌธํŽธ
BigQuery์˜ ๋ชจ๋“  ๊ฒƒ(๊ธฐํš์ž, ๋งˆ์ผ€ํ„ฐ, ์‹ ์ž… ๋ฐ์ดํ„ฐ ๋ถ„์„๊ฐ€๋ฅผ ์œ„ํ•œ) ์ž…๋ฌธํŽธ
Seongyun Byeon
ย 
แ„ƒแ…ขแ„‹แ…ญแ†ผแ„…แ…ฃแ†ผ แ„…แ…ฉแ„€แ…ณแ„‡แ…ฎแ†ซแ„‰แ…ฅแ†จ Bigqueryแ„…แ…ฉ แ„€แ…กแ†ซแ„ƒแ…กแ†ซแ„’แ…ต แ„‰แ…กแ„‹แ…ญแ†ผแ„’แ…กแ„€แ…ต (20170215 T์•„์นด๋ฐ๋ฏธ)
แ„ƒแ…ขแ„‹แ…ญแ†ผแ„…แ…ฃแ†ผ แ„…แ…ฉแ„€แ…ณแ„‡แ…ฎแ†ซแ„‰แ…ฅแ†จ Bigqueryแ„…แ…ฉ แ„€แ…กแ†ซแ„ƒแ…กแ†ซแ„’แ…ต แ„‰แ…กแ„‹แ…ญแ†ผแ„’แ…กแ„€แ…ต (20170215 T์•„์นด๋ฐ๋ฏธ)แ„ƒแ…ขแ„‹แ…ญแ†ผแ„…แ…ฃแ†ผ แ„…แ…ฉแ„€แ…ณแ„‡แ…ฎแ†ซแ„‰แ…ฅแ†จ Bigqueryแ„…แ…ฉ แ„€แ…กแ†ซแ„ƒแ…กแ†ซแ„’แ…ต แ„‰แ…กแ„‹แ…ญแ†ผแ„’แ…กแ„€แ…ต (20170215 T์•„์นด๋ฐ๋ฏธ)
แ„ƒแ…ขแ„‹แ…ญแ†ผแ„…แ…ฃแ†ผ แ„…แ…ฉแ„€แ…ณแ„‡แ…ฎแ†ซแ„‰แ…ฅแ†จ Bigqueryแ„…แ…ฉ แ„€แ…กแ†ซแ„ƒแ…กแ†ซแ„’แ…ต แ„‰แ…กแ„‹แ…ญแ†ผแ„’แ…กแ„€แ…ต (20170215 T์•„์นด๋ฐ๋ฏธ)
Jaikwang Lee
ย 
[236] แ„แ…กแ„แ…กแ„‹แ…ฉแ„‹แ…ดแ„ƒแ…ฆแ„‹แ…ตแ„แ…ฅแ„‘แ…กแ„‹แ…ตแ„‘แ…ณแ„…แ…กแ„‹แ…ตแ†ซ แ„‹แ…ฒแ†ซแ„ƒแ…ฉแ„‹แ…งแ†ผ
[236] แ„แ…กแ„แ…กแ„‹แ…ฉแ„‹แ…ดแ„ƒแ…ฆแ„‹แ…ตแ„แ…ฅแ„‘แ…กแ„‹แ…ตแ„‘แ…ณแ„…แ…กแ„‹แ…ตแ†ซ แ„‹แ…ฒแ†ซแ„ƒแ…ฉแ„‹แ…งแ†ผ[236] แ„แ…กแ„แ…กแ„‹แ…ฉแ„‹แ…ดแ„ƒแ…ฆแ„‹แ…ตแ„แ…ฅแ„‘แ…กแ„‹แ…ตแ„‘แ…ณแ„…แ…กแ„‹แ…ตแ†ซ แ„‹แ…ฒแ†ซแ„ƒแ…ฉแ„‹แ…งแ†ผ
[236] แ„แ…กแ„แ…กแ„‹แ…ฉแ„‹แ…ดแ„ƒแ…ฆแ„‹แ…ตแ„แ…ฅแ„‘แ…กแ„‹แ…ตแ„‘แ…ณแ„…แ…กแ„‹แ…ตแ†ซ แ„‹แ…ฒแ†ซแ„ƒแ…ฉแ„‹แ…งแ†ผ
NAVER D2
ย 
How to build massive service for advance
How to build massive service for advanceHow to build massive service for advance
How to build massive service for advance
DaeMyung Kang
ย 
Bigquery์™€ airflow๋ฅผ ์ด์šฉํ•œ ๋ฐ์ดํ„ฐ ๋ถ„์„ ์‹œ์Šคํ…œ ๊ตฌ์ถ• v1 ๋‚˜๋ฌด๊ธฐ์ˆ (์ฃผ) ์ตœ์œ ์„ 20170912
Bigquery์™€ airflow๋ฅผ ์ด์šฉํ•œ ๋ฐ์ดํ„ฐ ๋ถ„์„ ์‹œ์Šคํ…œ ๊ตฌ์ถ• v1  ๋‚˜๋ฌด๊ธฐ์ˆ (์ฃผ) ์ตœ์œ ์„ 20170912Bigquery์™€ airflow๋ฅผ ์ด์šฉํ•œ ๋ฐ์ดํ„ฐ ๋ถ„์„ ์‹œ์Šคํ…œ ๊ตฌ์ถ• v1  ๋‚˜๋ฌด๊ธฐ์ˆ (์ฃผ) ์ตœ์œ ์„ 20170912
Bigquery์™€ airflow๋ฅผ ์ด์šฉํ•œ ๋ฐ์ดํ„ฐ ๋ถ„์„ ์‹œ์Šคํ…œ ๊ตฌ์ถ• v1 ๋‚˜๋ฌด๊ธฐ์ˆ (์ฃผ) ์ตœ์œ ์„ 20170912
Yooseok Choi
ย 
์‰ฝ๊ฒŒ ์“ฐ์—ฌ์ง„ Django
์‰ฝ๊ฒŒ ์“ฐ์—ฌ์ง„ Django์‰ฝ๊ฒŒ ์“ฐ์—ฌ์ง„ Django
์‰ฝ๊ฒŒ ์“ฐ์—ฌ์ง„ Django
Taehoon Kim
ย 
์˜ค๋Š˜ ๋ฐค๋ถ€ํ„ฐ ์“ฐ๋Š” google analytics (๊ตฌ๊ธ€ ์• ๋„๋ฆฌํ‹ฑ์Šค, GA)
์˜ค๋Š˜ ๋ฐค๋ถ€ํ„ฐ ์“ฐ๋Š” google analytics (๊ตฌ๊ธ€ ์• ๋„๋ฆฌํ‹ฑ์Šค, GA) ์˜ค๋Š˜ ๋ฐค๋ถ€ํ„ฐ ์“ฐ๋Š” google analytics (๊ตฌ๊ธ€ ์• ๋„๋ฆฌํ‹ฑ์Šค, GA)
์˜ค๋Š˜ ๋ฐค๋ถ€ํ„ฐ ์“ฐ๋Š” google analytics (๊ตฌ๊ธ€ ์• ๋„๋ฆฌํ‹ฑ์Šค, GA)
Yongho Ha
ย 
Event source แ„’แ…กแ†จแ„‰แ…ณแ†ธ แ„‚แ…ขแ„‹แ…ญแ†ผ แ„€แ…ฉแ†ผแ„‹แ…ฒ
Event source แ„’แ…กแ†จแ„‰แ…ณแ†ธ แ„‚แ…ขแ„‹แ…ญแ†ผ แ„€แ…ฉแ†ผแ„‹แ…ฒEvent source แ„’แ…กแ†จแ„‰แ…ณแ†ธ แ„‚แ…ขแ„‹แ…ญแ†ผ แ„€แ…ฉแ†ผแ„‹แ…ฒ
Event source แ„’แ…กแ†จแ„‰แ…ณแ†ธ แ„‚แ…ขแ„‹แ…ญแ†ผ แ„€แ…ฉแ†ผแ„‹แ…ฒ
beom kyun choi
ย 
[์šฐ๋ฆฌ๊ฐ€ ๋ฐ์ดํ„ฐ๋ฅผ ์“ฐ๋Š” ๋ฒ•] ์ข‹๋‹ค๋Š” ๊ฑด ์•Œ๊ฒ ๋Š”๋ฐ ์ข€ ์จ๋ณด๊ณ  ์‹ถ์†Œ. ๋ฐ์ดํ„ฐ! - ๋„˜๋ฒ„์›์Šค ํ•˜์šฉํ˜ธ ๋Œ€ํ‘œ
[์šฐ๋ฆฌ๊ฐ€ ๋ฐ์ดํ„ฐ๋ฅผ ์“ฐ๋Š” ๋ฒ•] ์ข‹๋‹ค๋Š” ๊ฑด ์•Œ๊ฒ ๋Š”๋ฐ ์ข€ ์จ๋ณด๊ณ  ์‹ถ์†Œ. ๋ฐ์ดํ„ฐ! - ๋„˜๋ฒ„์›์Šค ํ•˜์šฉํ˜ธ ๋Œ€ํ‘œ[์šฐ๋ฆฌ๊ฐ€ ๋ฐ์ดํ„ฐ๋ฅผ ์“ฐ๋Š” ๋ฒ•] ์ข‹๋‹ค๋Š” ๊ฑด ์•Œ๊ฒ ๋Š”๋ฐ ์ข€ ์จ๋ณด๊ณ  ์‹ถ์†Œ. ๋ฐ์ดํ„ฐ! - ๋„˜๋ฒ„์›์Šค ํ•˜์šฉํ˜ธ ๋Œ€ํ‘œ
[์šฐ๋ฆฌ๊ฐ€ ๋ฐ์ดํ„ฐ๋ฅผ ์“ฐ๋Š” ๋ฒ•] ์ข‹๋‹ค๋Š” ๊ฑด ์•Œ๊ฒ ๋Š”๋ฐ ์ข€ ์จ๋ณด๊ณ  ์‹ถ์†Œ. ๋ฐ์ดํ„ฐ! - ๋„˜๋ฒ„์›์Šค ํ•˜์šฉํ˜ธ ๋Œ€ํ‘œ
Dylan Ko
ย 
แ„ƒแ…ฆแ„‹แ…ตแ„‚แ…กแ„‹แ…ด แ„Žแ…กแ†ทแ„‰แ…ฑแ„‹แ…ฎแ†ซ แ„‹แ…ขแ„ƒแ…ณแ„แ…ฆแ„แ…ณ (20150419)
แ„ƒแ…ฆแ„‹แ…ตแ„‚แ…กแ„‹แ…ด แ„Žแ…กแ†ทแ„‰แ…ฑแ„‹แ…ฎแ†ซ แ„‹แ…ขแ„ƒแ…ณแ„แ…ฆแ„แ…ณ (20150419)แ„ƒแ…ฆแ„‹แ…ตแ„‚แ…กแ„‹แ…ด แ„Žแ…กแ†ทแ„‰แ…ฑแ„‹แ…ฎแ†ซ แ„‹แ…ขแ„ƒแ…ณแ„แ…ฆแ„แ…ณ (20150419)
แ„ƒแ…ฆแ„‹แ…ตแ„‚แ…กแ„‹แ…ด แ„Žแ…กแ†ทแ„‰แ…ฑแ„‹แ…ฎแ†ซ แ„‹แ…ขแ„ƒแ…ณแ„แ…ฆแ„แ…ณ (20150419)
Dana Jeong
ย 
[NDC18] ์•ผ์ƒ์˜ ๋•… ๋“€๋ž‘๊ณ ์˜ ๋ฐ์ดํ„ฐ ์—”์ง€๋‹ˆ์–ด๋ง ์ด์•ผ๊ธฐ: ๋กœ๊ทธ ์‹œ์Šคํ…œ ๊ตฌ์ถ• ๊ฒฝํ—˜ ๊ณต์œ 
[NDC18] ์•ผ์ƒ์˜ ๋•… ๋“€๋ž‘๊ณ ์˜ ๋ฐ์ดํ„ฐ ์—”์ง€๋‹ˆ์–ด๋ง ์ด์•ผ๊ธฐ: ๋กœ๊ทธ ์‹œ์Šคํ…œ ๊ตฌ์ถ• ๊ฒฝํ—˜ ๊ณต์œ [NDC18] ์•ผ์ƒ์˜ ๋•… ๋“€๋ž‘๊ณ ์˜ ๋ฐ์ดํ„ฐ ์—”์ง€๋‹ˆ์–ด๋ง ์ด์•ผ๊ธฐ: ๋กœ๊ทธ ์‹œ์Šคํ…œ ๊ตฌ์ถ• ๊ฒฝํ—˜ ๊ณต์œ 
[NDC18] ์•ผ์ƒ์˜ ๋•… ๋“€๋ž‘๊ณ ์˜ ๋ฐ์ดํ„ฐ ์—”์ง€๋‹ˆ์–ด๋ง ์ด์•ผ๊ธฐ: ๋กœ๊ทธ ์‹œ์Šคํ…œ ๊ตฌ์ถ• ๊ฒฝํ—˜ ๊ณต์œ 
Hyojun Jeon
ย 
์Šคํƒ€ํŠธ์—…์€ ๋ฐ์ดํ„ฐ๋ฅผ ์–ด๋–ป๊ฒŒ ๋ฐ”๋ผ๋ด์•ผ ํ• ๊นŒ? (๊ฐœ์ •ํŒ)
์Šคํƒ€ํŠธ์—…์€ ๋ฐ์ดํ„ฐ๋ฅผ ์–ด๋–ป๊ฒŒ ๋ฐ”๋ผ๋ด์•ผ ํ• ๊นŒ? (๊ฐœ์ •ํŒ)์Šคํƒ€ํŠธ์—…์€ ๋ฐ์ดํ„ฐ๋ฅผ ์–ด๋–ป๊ฒŒ ๋ฐ”๋ผ๋ด์•ผ ํ• ๊นŒ? (๊ฐœ์ •ํŒ)
์Šคํƒ€ํŠธ์—…์€ ๋ฐ์ดํ„ฐ๋ฅผ ์–ด๋–ป๊ฒŒ ๋ฐ”๋ผ๋ด์•ผ ํ• ๊นŒ? (๊ฐœ์ •ํŒ)
Yongho Ha
ย 
webservice scaling for newbie
webservice scaling for newbiewebservice scaling for newbie
webservice scaling for newbie
DaeMyung Kang
ย 
[2018] ๊ตฌ์กฐํ™”๋œ ๊ฒ€์ƒ‰ ๋ชจ๋ธ
[2018] ๊ตฌ์กฐํ™”๋œ ๊ฒ€์ƒ‰ ๋ชจ๋ธ[2018] ๊ตฌ์กฐํ™”๋œ ๊ฒ€์ƒ‰ ๋ชจ๋ธ
[2018] ๊ตฌ์กฐํ™”๋œ ๊ฒ€์ƒ‰ ๋ชจ๋ธ
NHN FORWARD
ย 
๊ธˆ์œต ๋ฐ์ดํ„ฐ ์ดํ•ด์™€ ๋ถ„์„ PyCon 2014
๊ธˆ์œต ๋ฐ์ดํ„ฐ ์ดํ•ด์™€ ๋ถ„์„ PyCon 2014๊ธˆ์œต ๋ฐ์ดํ„ฐ ์ดํ•ด์™€ ๋ถ„์„ PyCon 2014
๊ธˆ์œต ๋ฐ์ดํ„ฐ ์ดํ•ด์™€ ๋ถ„์„ PyCon 2014
Seung-June Lee
ย 
[NDC ๋ฐœํ‘œ] ๋ชจ๋ฐ”์ผ ๊ฒŒ์ž„๋ฐ์ดํ„ฐ๋ถ„์„ ๋ฐ ์‹ค์ „ ํ™œ์šฉ
[NDC ๋ฐœํ‘œ] ๋ชจ๋ฐ”์ผ ๊ฒŒ์ž„๋ฐ์ดํ„ฐ๋ถ„์„ ๋ฐ ์‹ค์ „ ํ™œ์šฉ[NDC ๋ฐœํ‘œ] ๋ชจ๋ฐ”์ผ ๊ฒŒ์ž„๋ฐ์ดํ„ฐ๋ถ„์„ ๋ฐ ์‹ค์ „ ํ™œ์šฉ
[NDC ๋ฐœํ‘œ] ๋ชจ๋ฐ”์ผ ๊ฒŒ์ž„๋ฐ์ดํ„ฐ๋ถ„์„ ๋ฐ ์‹ค์ „ ํ™œ์šฉ
Tapjoy X 5Rocks
ย 
แ„‚แ…กแ„‹แ…ด แ„‹แ…ตแ„Œแ…ตแ†จ แ„‹แ…ตแ„‹แ…ฃแ„€แ…ต
แ„‚แ…กแ„‹แ…ด แ„‹แ…ตแ„Œแ…ตแ†จ แ„‹แ…ตแ„‹แ…ฃแ„€แ…ตแ„‚แ…กแ„‹แ…ด แ„‹แ…ตแ„Œแ…ตแ†จ แ„‹แ…ตแ„‹แ…ฃแ„€แ…ต
แ„‚แ…กแ„‹แ…ด แ„‹แ…ตแ„Œแ…ตแ†จ แ„‹แ…ตแ„‹แ…ฃแ„€แ…ต
์ข…๋ฆฝ ์ด
ย 
๋ฐ์ดํ„ฐ ๋ถ„์„๊ฐ€๋Š” ์–ด๋–ค SKILLSET์„ ๊ฐ€์ ธ์•ผ ํ•˜๋Š”๊ฐ€? - ๋ฐ์ดํ„ฐ ๋ถ„์„๊ฐ€ ๋˜๊ธฐ
๋ฐ์ดํ„ฐ ๋ถ„์„๊ฐ€๋Š” ์–ด๋–ค SKILLSET์„ ๊ฐ€์ ธ์•ผ ํ•˜๋Š”๊ฐ€?  - ๋ฐ์ดํ„ฐ ๋ถ„์„๊ฐ€ ๋˜๊ธฐ๋ฐ์ดํ„ฐ ๋ถ„์„๊ฐ€๋Š” ์–ด๋–ค SKILLSET์„ ๊ฐ€์ ธ์•ผ ํ•˜๋Š”๊ฐ€?  - ๋ฐ์ดํ„ฐ ๋ถ„์„๊ฐ€ ๋˜๊ธฐ
๋ฐ์ดํ„ฐ ๋ถ„์„๊ฐ€๋Š” ์–ด๋–ค SKILLSET์„ ๊ฐ€์ ธ์•ผ ํ•˜๋Š”๊ฐ€? - ๋ฐ์ดํ„ฐ ๋ถ„์„๊ฐ€ ๋˜๊ธฐ
Hui Seo
ย 

What's hot (20)

์นด์นด์˜คํ†ก์œผ๋กœ ์—ฌ์นœ ๋งŒ๋“ค๊ธฐ 2013.06.29
์นด์นด์˜คํ†ก์œผ๋กœ ์—ฌ์นœ ๋งŒ๋“ค๊ธฐ 2013.06.29์นด์นด์˜คํ†ก์œผ๋กœ ์—ฌ์นœ ๋งŒ๋“ค๊ธฐ 2013.06.29
์นด์นด์˜คํ†ก์œผ๋กœ ์—ฌ์นœ ๋งŒ๋“ค๊ธฐ 2013.06.29
ย 
๋กœ๊ทธ ๊ธฐ๊น”๋‚˜๊ฒŒ ์ž˜ ๋””์ž์ธํ•˜๋Š” ๋ฒ•
๋กœ๊ทธ ๊ธฐ๊น”๋‚˜๊ฒŒ ์ž˜ ๋””์ž์ธํ•˜๋Š” ๋ฒ•๋กœ๊ทธ ๊ธฐ๊น”๋‚˜๊ฒŒ ์ž˜ ๋””์ž์ธํ•˜๋Š” ๋ฒ•
๋กœ๊ทธ ๊ธฐ๊น”๋‚˜๊ฒŒ ์ž˜ ๋””์ž์ธํ•˜๋Š” ๋ฒ•
ย 
BigQuery์˜ ๋ชจ๋“  ๊ฒƒ(๊ธฐํš์ž, ๋งˆ์ผ€ํ„ฐ, ์‹ ์ž… ๋ฐ์ดํ„ฐ ๋ถ„์„๊ฐ€๋ฅผ ์œ„ํ•œ) ์ž…๋ฌธํŽธ
BigQuery์˜ ๋ชจ๋“  ๊ฒƒ(๊ธฐํš์ž, ๋งˆ์ผ€ํ„ฐ, ์‹ ์ž… ๋ฐ์ดํ„ฐ ๋ถ„์„๊ฐ€๋ฅผ ์œ„ํ•œ) ์ž…๋ฌธํŽธBigQuery์˜ ๋ชจ๋“  ๊ฒƒ(๊ธฐํš์ž, ๋งˆ์ผ€ํ„ฐ, ์‹ ์ž… ๋ฐ์ดํ„ฐ ๋ถ„์„๊ฐ€๋ฅผ ์œ„ํ•œ) ์ž…๋ฌธํŽธ
BigQuery์˜ ๋ชจ๋“  ๊ฒƒ(๊ธฐํš์ž, ๋งˆ์ผ€ํ„ฐ, ์‹ ์ž… ๋ฐ์ดํ„ฐ ๋ถ„์„๊ฐ€๋ฅผ ์œ„ํ•œ) ์ž…๋ฌธํŽธ
ย 
แ„ƒแ…ขแ„‹แ…ญแ†ผแ„…แ…ฃแ†ผ แ„…แ…ฉแ„€แ…ณแ„‡แ…ฎแ†ซแ„‰แ…ฅแ†จ Bigqueryแ„…แ…ฉ แ„€แ…กแ†ซแ„ƒแ…กแ†ซแ„’แ…ต แ„‰แ…กแ„‹แ…ญแ†ผแ„’แ…กแ„€แ…ต (20170215 T์•„์นด๋ฐ๋ฏธ)
แ„ƒแ…ขแ„‹แ…ญแ†ผแ„…แ…ฃแ†ผ แ„…แ…ฉแ„€แ…ณแ„‡แ…ฎแ†ซแ„‰แ…ฅแ†จ Bigqueryแ„…แ…ฉ แ„€แ…กแ†ซแ„ƒแ…กแ†ซแ„’แ…ต แ„‰แ…กแ„‹แ…ญแ†ผแ„’แ…กแ„€แ…ต (20170215 T์•„์นด๋ฐ๋ฏธ)แ„ƒแ…ขแ„‹แ…ญแ†ผแ„…แ…ฃแ†ผ แ„…แ…ฉแ„€แ…ณแ„‡แ…ฎแ†ซแ„‰แ…ฅแ†จ Bigqueryแ„…แ…ฉ แ„€แ…กแ†ซแ„ƒแ…กแ†ซแ„’แ…ต แ„‰แ…กแ„‹แ…ญแ†ผแ„’แ…กแ„€แ…ต (20170215 T์•„์นด๋ฐ๋ฏธ)
แ„ƒแ…ขแ„‹แ…ญแ†ผแ„…แ…ฃแ†ผ แ„…แ…ฉแ„€แ…ณแ„‡แ…ฎแ†ซแ„‰แ…ฅแ†จ Bigqueryแ„…แ…ฉ แ„€แ…กแ†ซแ„ƒแ…กแ†ซแ„’แ…ต แ„‰แ…กแ„‹แ…ญแ†ผแ„’แ…กแ„€แ…ต (20170215 T์•„์นด๋ฐ๋ฏธ)
ย 
[236] แ„แ…กแ„แ…กแ„‹แ…ฉแ„‹แ…ดแ„ƒแ…ฆแ„‹แ…ตแ„แ…ฅแ„‘แ…กแ„‹แ…ตแ„‘แ…ณแ„…แ…กแ„‹แ…ตแ†ซ แ„‹แ…ฒแ†ซแ„ƒแ…ฉแ„‹แ…งแ†ผ
[236] แ„แ…กแ„แ…กแ„‹แ…ฉแ„‹แ…ดแ„ƒแ…ฆแ„‹แ…ตแ„แ…ฅแ„‘แ…กแ„‹แ…ตแ„‘แ…ณแ„…แ…กแ„‹แ…ตแ†ซ แ„‹แ…ฒแ†ซแ„ƒแ…ฉแ„‹แ…งแ†ผ[236] แ„แ…กแ„แ…กแ„‹แ…ฉแ„‹แ…ดแ„ƒแ…ฆแ„‹แ…ตแ„แ…ฅแ„‘แ…กแ„‹แ…ตแ„‘แ…ณแ„…แ…กแ„‹แ…ตแ†ซ แ„‹แ…ฒแ†ซแ„ƒแ…ฉแ„‹แ…งแ†ผ
[236] แ„แ…กแ„แ…กแ„‹แ…ฉแ„‹แ…ดแ„ƒแ…ฆแ„‹แ…ตแ„แ…ฅแ„‘แ…กแ„‹แ…ตแ„‘แ…ณแ„…แ…กแ„‹แ…ตแ†ซ แ„‹แ…ฒแ†ซแ„ƒแ…ฉแ„‹แ…งแ†ผ
ย 
How to build massive service for advance
How to build massive service for advanceHow to build massive service for advance
How to build massive service for advance
ย 
Bigquery์™€ airflow๋ฅผ ์ด์šฉํ•œ ๋ฐ์ดํ„ฐ ๋ถ„์„ ์‹œ์Šคํ…œ ๊ตฌ์ถ• v1 ๋‚˜๋ฌด๊ธฐ์ˆ (์ฃผ) ์ตœ์œ ์„ 20170912
Bigquery์™€ airflow๋ฅผ ์ด์šฉํ•œ ๋ฐ์ดํ„ฐ ๋ถ„์„ ์‹œ์Šคํ…œ ๊ตฌ์ถ• v1  ๋‚˜๋ฌด๊ธฐ์ˆ (์ฃผ) ์ตœ์œ ์„ 20170912Bigquery์™€ airflow๋ฅผ ์ด์šฉํ•œ ๋ฐ์ดํ„ฐ ๋ถ„์„ ์‹œ์Šคํ…œ ๊ตฌ์ถ• v1  ๋‚˜๋ฌด๊ธฐ์ˆ (์ฃผ) ์ตœ์œ ์„ 20170912
Bigquery์™€ airflow๋ฅผ ์ด์šฉํ•œ ๋ฐ์ดํ„ฐ ๋ถ„์„ ์‹œ์Šคํ…œ ๊ตฌ์ถ• v1 ๋‚˜๋ฌด๊ธฐ์ˆ (์ฃผ) ์ตœ์œ ์„ 20170912
ย 
์‰ฝ๊ฒŒ ์“ฐ์—ฌ์ง„ Django
์‰ฝ๊ฒŒ ์“ฐ์—ฌ์ง„ Django์‰ฝ๊ฒŒ ์“ฐ์—ฌ์ง„ Django
์‰ฝ๊ฒŒ ์“ฐ์—ฌ์ง„ Django
ย 
์˜ค๋Š˜ ๋ฐค๋ถ€ํ„ฐ ์“ฐ๋Š” google analytics (๊ตฌ๊ธ€ ์• ๋„๋ฆฌํ‹ฑ์Šค, GA)
์˜ค๋Š˜ ๋ฐค๋ถ€ํ„ฐ ์“ฐ๋Š” google analytics (๊ตฌ๊ธ€ ์• ๋„๋ฆฌํ‹ฑ์Šค, GA) ์˜ค๋Š˜ ๋ฐค๋ถ€ํ„ฐ ์“ฐ๋Š” google analytics (๊ตฌ๊ธ€ ์• ๋„๋ฆฌํ‹ฑ์Šค, GA)
์˜ค๋Š˜ ๋ฐค๋ถ€ํ„ฐ ์“ฐ๋Š” google analytics (๊ตฌ๊ธ€ ์• ๋„๋ฆฌํ‹ฑ์Šค, GA)
ย 
Event source แ„’แ…กแ†จแ„‰แ…ณแ†ธ แ„‚แ…ขแ„‹แ…ญแ†ผ แ„€แ…ฉแ†ผแ„‹แ…ฒ
Event source แ„’แ…กแ†จแ„‰แ…ณแ†ธ แ„‚แ…ขแ„‹แ…ญแ†ผ แ„€แ…ฉแ†ผแ„‹แ…ฒEvent source แ„’แ…กแ†จแ„‰แ…ณแ†ธ แ„‚แ…ขแ„‹แ…ญแ†ผ แ„€แ…ฉแ†ผแ„‹แ…ฒ
Event source แ„’แ…กแ†จแ„‰แ…ณแ†ธ แ„‚แ…ขแ„‹แ…ญแ†ผ แ„€แ…ฉแ†ผแ„‹แ…ฒ
ย 
[์šฐ๋ฆฌ๊ฐ€ ๋ฐ์ดํ„ฐ๋ฅผ ์“ฐ๋Š” ๋ฒ•] ์ข‹๋‹ค๋Š” ๊ฑด ์•Œ๊ฒ ๋Š”๋ฐ ์ข€ ์จ๋ณด๊ณ  ์‹ถ์†Œ. ๋ฐ์ดํ„ฐ! - ๋„˜๋ฒ„์›์Šค ํ•˜์šฉํ˜ธ ๋Œ€ํ‘œ
[์šฐ๋ฆฌ๊ฐ€ ๋ฐ์ดํ„ฐ๋ฅผ ์“ฐ๋Š” ๋ฒ•] ์ข‹๋‹ค๋Š” ๊ฑด ์•Œ๊ฒ ๋Š”๋ฐ ์ข€ ์จ๋ณด๊ณ  ์‹ถ์†Œ. ๋ฐ์ดํ„ฐ! - ๋„˜๋ฒ„์›์Šค ํ•˜์šฉํ˜ธ ๋Œ€ํ‘œ[์šฐ๋ฆฌ๊ฐ€ ๋ฐ์ดํ„ฐ๋ฅผ ์“ฐ๋Š” ๋ฒ•] ์ข‹๋‹ค๋Š” ๊ฑด ์•Œ๊ฒ ๋Š”๋ฐ ์ข€ ์จ๋ณด๊ณ  ์‹ถ์†Œ. ๋ฐ์ดํ„ฐ! - ๋„˜๋ฒ„์›์Šค ํ•˜์šฉํ˜ธ ๋Œ€ํ‘œ
[์šฐ๋ฆฌ๊ฐ€ ๋ฐ์ดํ„ฐ๋ฅผ ์“ฐ๋Š” ๋ฒ•] ์ข‹๋‹ค๋Š” ๊ฑด ์•Œ๊ฒ ๋Š”๋ฐ ์ข€ ์จ๋ณด๊ณ  ์‹ถ์†Œ. ๋ฐ์ดํ„ฐ! - ๋„˜๋ฒ„์›์Šค ํ•˜์šฉํ˜ธ ๋Œ€ํ‘œ
ย 
แ„ƒแ…ฆแ„‹แ…ตแ„‚แ…กแ„‹แ…ด แ„Žแ…กแ†ทแ„‰แ…ฑแ„‹แ…ฎแ†ซ แ„‹แ…ขแ„ƒแ…ณแ„แ…ฆแ„แ…ณ (20150419)
แ„ƒแ…ฆแ„‹แ…ตแ„‚แ…กแ„‹แ…ด แ„Žแ…กแ†ทแ„‰แ…ฑแ„‹แ…ฎแ†ซ แ„‹แ…ขแ„ƒแ…ณแ„แ…ฆแ„แ…ณ (20150419)แ„ƒแ…ฆแ„‹แ…ตแ„‚แ…กแ„‹แ…ด แ„Žแ…กแ†ทแ„‰แ…ฑแ„‹แ…ฎแ†ซ แ„‹แ…ขแ„ƒแ…ณแ„แ…ฆแ„แ…ณ (20150419)
แ„ƒแ…ฆแ„‹แ…ตแ„‚แ…กแ„‹แ…ด แ„Žแ…กแ†ทแ„‰แ…ฑแ„‹แ…ฎแ†ซ แ„‹แ…ขแ„ƒแ…ณแ„แ…ฆแ„แ…ณ (20150419)
ย 
[NDC18] ์•ผ์ƒ์˜ ๋•… ๋“€๋ž‘๊ณ ์˜ ๋ฐ์ดํ„ฐ ์—”์ง€๋‹ˆ์–ด๋ง ์ด์•ผ๊ธฐ: ๋กœ๊ทธ ์‹œ์Šคํ…œ ๊ตฌ์ถ• ๊ฒฝํ—˜ ๊ณต์œ 
[NDC18] ์•ผ์ƒ์˜ ๋•… ๋“€๋ž‘๊ณ ์˜ ๋ฐ์ดํ„ฐ ์—”์ง€๋‹ˆ์–ด๋ง ์ด์•ผ๊ธฐ: ๋กœ๊ทธ ์‹œ์Šคํ…œ ๊ตฌ์ถ• ๊ฒฝํ—˜ ๊ณต์œ [NDC18] ์•ผ์ƒ์˜ ๋•… ๋“€๋ž‘๊ณ ์˜ ๋ฐ์ดํ„ฐ ์—”์ง€๋‹ˆ์–ด๋ง ์ด์•ผ๊ธฐ: ๋กœ๊ทธ ์‹œ์Šคํ…œ ๊ตฌ์ถ• ๊ฒฝํ—˜ ๊ณต์œ 
[NDC18] ์•ผ์ƒ์˜ ๋•… ๋“€๋ž‘๊ณ ์˜ ๋ฐ์ดํ„ฐ ์—”์ง€๋‹ˆ์–ด๋ง ์ด์•ผ๊ธฐ: ๋กœ๊ทธ ์‹œ์Šคํ…œ ๊ตฌ์ถ• ๊ฒฝํ—˜ ๊ณต์œ 
ย 
์Šคํƒ€ํŠธ์—…์€ ๋ฐ์ดํ„ฐ๋ฅผ ์–ด๋–ป๊ฒŒ ๋ฐ”๋ผ๋ด์•ผ ํ• ๊นŒ? (๊ฐœ์ •ํŒ)
์Šคํƒ€ํŠธ์—…์€ ๋ฐ์ดํ„ฐ๋ฅผ ์–ด๋–ป๊ฒŒ ๋ฐ”๋ผ๋ด์•ผ ํ• ๊นŒ? (๊ฐœ์ •ํŒ)์Šคํƒ€ํŠธ์—…์€ ๋ฐ์ดํ„ฐ๋ฅผ ์–ด๋–ป๊ฒŒ ๋ฐ”๋ผ๋ด์•ผ ํ• ๊นŒ? (๊ฐœ์ •ํŒ)
์Šคํƒ€ํŠธ์—…์€ ๋ฐ์ดํ„ฐ๋ฅผ ์–ด๋–ป๊ฒŒ ๋ฐ”๋ผ๋ด์•ผ ํ• ๊นŒ? (๊ฐœ์ •ํŒ)
ย 
webservice scaling for newbie
webservice scaling for newbiewebservice scaling for newbie
webservice scaling for newbie
ย 
[2018] ๊ตฌ์กฐํ™”๋œ ๊ฒ€์ƒ‰ ๋ชจ๋ธ
[2018] ๊ตฌ์กฐํ™”๋œ ๊ฒ€์ƒ‰ ๋ชจ๋ธ[2018] ๊ตฌ์กฐํ™”๋œ ๊ฒ€์ƒ‰ ๋ชจ๋ธ
[2018] ๊ตฌ์กฐํ™”๋œ ๊ฒ€์ƒ‰ ๋ชจ๋ธ
ย 
๊ธˆ์œต ๋ฐ์ดํ„ฐ ์ดํ•ด์™€ ๋ถ„์„ PyCon 2014
๊ธˆ์œต ๋ฐ์ดํ„ฐ ์ดํ•ด์™€ ๋ถ„์„ PyCon 2014๊ธˆ์œต ๋ฐ์ดํ„ฐ ์ดํ•ด์™€ ๋ถ„์„ PyCon 2014
๊ธˆ์œต ๋ฐ์ดํ„ฐ ์ดํ•ด์™€ ๋ถ„์„ PyCon 2014
ย 
[NDC ๋ฐœํ‘œ] ๋ชจ๋ฐ”์ผ ๊ฒŒ์ž„๋ฐ์ดํ„ฐ๋ถ„์„ ๋ฐ ์‹ค์ „ ํ™œ์šฉ
[NDC ๋ฐœํ‘œ] ๋ชจ๋ฐ”์ผ ๊ฒŒ์ž„๋ฐ์ดํ„ฐ๋ถ„์„ ๋ฐ ์‹ค์ „ ํ™œ์šฉ[NDC ๋ฐœํ‘œ] ๋ชจ๋ฐ”์ผ ๊ฒŒ์ž„๋ฐ์ดํ„ฐ๋ถ„์„ ๋ฐ ์‹ค์ „ ํ™œ์šฉ
[NDC ๋ฐœํ‘œ] ๋ชจ๋ฐ”์ผ ๊ฒŒ์ž„๋ฐ์ดํ„ฐ๋ถ„์„ ๋ฐ ์‹ค์ „ ํ™œ์šฉ
ย 
แ„‚แ…กแ„‹แ…ด แ„‹แ…ตแ„Œแ…ตแ†จ แ„‹แ…ตแ„‹แ…ฃแ„€แ…ต
แ„‚แ…กแ„‹แ…ด แ„‹แ…ตแ„Œแ…ตแ†จ แ„‹แ…ตแ„‹แ…ฃแ„€แ…ตแ„‚แ…กแ„‹แ…ด แ„‹แ…ตแ„Œแ…ตแ†จ แ„‹แ…ตแ„‹แ…ฃแ„€แ…ต
แ„‚แ…กแ„‹แ…ด แ„‹แ…ตแ„Œแ…ตแ†จ แ„‹แ…ตแ„‹แ…ฃแ„€แ…ต
ย 
๋ฐ์ดํ„ฐ ๋ถ„์„๊ฐ€๋Š” ์–ด๋–ค SKILLSET์„ ๊ฐ€์ ธ์•ผ ํ•˜๋Š”๊ฐ€? - ๋ฐ์ดํ„ฐ ๋ถ„์„๊ฐ€ ๋˜๊ธฐ
๋ฐ์ดํ„ฐ ๋ถ„์„๊ฐ€๋Š” ์–ด๋–ค SKILLSET์„ ๊ฐ€์ ธ์•ผ ํ•˜๋Š”๊ฐ€?  - ๋ฐ์ดํ„ฐ ๋ถ„์„๊ฐ€ ๋˜๊ธฐ๋ฐ์ดํ„ฐ ๋ถ„์„๊ฐ€๋Š” ์–ด๋–ค SKILLSET์„ ๊ฐ€์ ธ์•ผ ํ•˜๋Š”๊ฐ€?  - ๋ฐ์ดํ„ฐ ๋ถ„์„๊ฐ€ ๋˜๊ธฐ
๋ฐ์ดํ„ฐ ๋ถ„์„๊ฐ€๋Š” ์–ด๋–ค SKILLSET์„ ๊ฐ€์ ธ์•ผ ํ•˜๋Š”๊ฐ€? - ๋ฐ์ดํ„ฐ ๋ถ„์„๊ฐ€ ๋˜๊ธฐ
ย 

Viewers also liked

แ„€แ…ฅแ†ทแ„‰แ…ขแ†จแ„‹แ…ฆแ†ซแ„Œแ…ตแ†ซแ„‹แ…ต แ„ƒแ…ฆแ„‹แ…ตแ„แ…ฅแ„…แ…ณแ†ฏ แ„ƒแ…กแ„…แ…ฎแ„‚แ…ณแ†ซ แ„‡แ…ฅแ†ธ แ„€แ…ตแ†ทแ„Œแ…ฉแ†ผแ„†แ…ตแ†ซ
แ„€แ…ฅแ†ทแ„‰แ…ขแ†จแ„‹แ…ฆแ†ซแ„Œแ…ตแ†ซแ„‹แ…ต แ„ƒแ…ฆแ„‹แ…ตแ„แ…ฅแ„…แ…ณแ†ฏ แ„ƒแ…กแ„…แ…ฎแ„‚แ…ณแ†ซ แ„‡แ…ฅแ†ธ แ„€แ…ตแ†ทแ„Œแ…ฉแ†ผแ„†แ…ตแ†ซแ„€แ…ฅแ†ทแ„‰แ…ขแ†จแ„‹แ…ฆแ†ซแ„Œแ…ตแ†ซแ„‹แ…ต แ„ƒแ…ฆแ„‹แ…ตแ„แ…ฅแ„…แ…ณแ†ฏ แ„ƒแ…กแ„…แ…ฎแ„‚แ…ณแ†ซ แ„‡แ…ฅแ†ธ แ„€แ…ตแ†ทแ„Œแ…ฉแ†ผแ„†แ…ตแ†ซ
แ„€แ…ฅแ†ทแ„‰แ…ขแ†จแ„‹แ…ฆแ†ซแ„Œแ…ตแ†ซแ„‹แ…ต แ„ƒแ…ฆแ„‹แ…ตแ„แ…ฅแ„…แ…ณแ†ฏ แ„ƒแ…กแ„…แ…ฎแ„‚แ…ณแ†ซ แ„‡แ…ฅแ†ธ แ„€แ…ตแ†ทแ„Œแ…ฉแ†ผแ„†แ…ตแ†ซ
์ข…๋ฏผ ๊น€
ย 
How to study
How to studyHow to study
How to study
DaeMyung Kang
ย 
์•”ํ˜ธํ™” ์ด๊ฒƒ๋งŒ ์•Œ๋ฉด ๋œ๋‹ค.
์•”ํ˜ธํ™” ์ด๊ฒƒ๋งŒ ์•Œ๋ฉด ๋œ๋‹ค.์•”ํ˜ธํ™” ์ด๊ฒƒ๋งŒ ์•Œ๋ฉด ๋œ๋‹ค.
์•”ํ˜ธํ™” ์ด๊ฒƒ๋งŒ ์•Œ๋ฉด ๋œ๋‹ค.
KwangSeob Jeong
ย 
์ž๊ธฐ์†Œ๊ฐœ์„œ, ์ด๋ ฅ์„œ ์“ฐ๋Š” ๋ฒ•
์ž๊ธฐ์†Œ๊ฐœ์„œ, ์ด๋ ฅ์„œ ์“ฐ๋Š” ๋ฒ•์ž๊ธฐ์†Œ๊ฐœ์„œ, ์ด๋ ฅ์„œ ์“ฐ๋Š” ๋ฒ•
์ž๊ธฐ์†Œ๊ฐœ์„œ, ์ด๋ ฅ์„œ ์“ฐ๋Š” ๋ฒ•
Minsuk Lee
ย 
Redis From 2.8 to 4.x
Redis From 2.8 to 4.xRedis From 2.8 to 4.x
Redis From 2.8 to 4.x
DaeMyung Kang
ย 
๋ฐฑ์–ต๊ฐœ์˜ ๋กœ๊ทธ๋ฅผ ๋ชจ์•„ ๊ฒ€์ƒ‰ํ•˜๊ณ  ๋ถ„์„ํ•˜๊ณ  ํ•™์Šต๋„ ์‹œ์ผœ๋ณด์ž : ๋กœ๊ธฐ์Šค
๋ฐฑ์–ต๊ฐœ์˜ ๋กœ๊ทธ๋ฅผ ๋ชจ์•„ ๊ฒ€์ƒ‰ํ•˜๊ณ  ๋ถ„์„ํ•˜๊ณ  ํ•™์Šต๋„ ์‹œ์ผœ๋ณด์ž : ๋กœ๊ธฐ์Šค๋ฐฑ์–ต๊ฐœ์˜ ๋กœ๊ทธ๋ฅผ ๋ชจ์•„ ๊ฒ€์ƒ‰ํ•˜๊ณ  ๋ถ„์„ํ•˜๊ณ  ํ•™์Šต๋„ ์‹œ์ผœ๋ณด์ž : ๋กœ๊ธฐ์Šค
๋ฐฑ์–ต๊ฐœ์˜ ๋กœ๊ทธ๋ฅผ ๋ชจ์•„ ๊ฒ€์ƒ‰ํ•˜๊ณ  ๋ถ„์„ํ•˜๊ณ  ํ•™์Šต๋„ ์‹œ์ผœ๋ณด์ž : ๋กœ๊ธฐ์Šค
NAVER D2
ย 

Viewers also liked (6)

แ„€แ…ฅแ†ทแ„‰แ…ขแ†จแ„‹แ…ฆแ†ซแ„Œแ…ตแ†ซแ„‹แ…ต แ„ƒแ…ฆแ„‹แ…ตแ„แ…ฅแ„…แ…ณแ†ฏ แ„ƒแ…กแ„…แ…ฎแ„‚แ…ณแ†ซ แ„‡แ…ฅแ†ธ แ„€แ…ตแ†ทแ„Œแ…ฉแ†ผแ„†แ…ตแ†ซ
แ„€แ…ฅแ†ทแ„‰แ…ขแ†จแ„‹แ…ฆแ†ซแ„Œแ…ตแ†ซแ„‹แ…ต แ„ƒแ…ฆแ„‹แ…ตแ„แ…ฅแ„…แ…ณแ†ฏ แ„ƒแ…กแ„…แ…ฎแ„‚แ…ณแ†ซ แ„‡แ…ฅแ†ธ แ„€แ…ตแ†ทแ„Œแ…ฉแ†ผแ„†แ…ตแ†ซแ„€แ…ฅแ†ทแ„‰แ…ขแ†จแ„‹แ…ฆแ†ซแ„Œแ…ตแ†ซแ„‹แ…ต แ„ƒแ…ฆแ„‹แ…ตแ„แ…ฅแ„…แ…ณแ†ฏ แ„ƒแ…กแ„…แ…ฎแ„‚แ…ณแ†ซ แ„‡แ…ฅแ†ธ แ„€แ…ตแ†ทแ„Œแ…ฉแ†ผแ„†แ…ตแ†ซ
แ„€แ…ฅแ†ทแ„‰แ…ขแ†จแ„‹แ…ฆแ†ซแ„Œแ…ตแ†ซแ„‹แ…ต แ„ƒแ…ฆแ„‹แ…ตแ„แ…ฅแ„…แ…ณแ†ฏ แ„ƒแ…กแ„…แ…ฎแ„‚แ…ณแ†ซ แ„‡แ…ฅแ†ธ แ„€แ…ตแ†ทแ„Œแ…ฉแ†ผแ„†แ…ตแ†ซ
ย 
How to study
How to studyHow to study
How to study
ย 
์•”ํ˜ธํ™” ์ด๊ฒƒ๋งŒ ์•Œ๋ฉด ๋œ๋‹ค.
์•”ํ˜ธํ™” ์ด๊ฒƒ๋งŒ ์•Œ๋ฉด ๋œ๋‹ค.์•”ํ˜ธํ™” ์ด๊ฒƒ๋งŒ ์•Œ๋ฉด ๋œ๋‹ค.
์•”ํ˜ธํ™” ์ด๊ฒƒ๋งŒ ์•Œ๋ฉด ๋œ๋‹ค.
ย 
์ž๊ธฐ์†Œ๊ฐœ์„œ, ์ด๋ ฅ์„œ ์“ฐ๋Š” ๋ฒ•
์ž๊ธฐ์†Œ๊ฐœ์„œ, ์ด๋ ฅ์„œ ์“ฐ๋Š” ๋ฒ•์ž๊ธฐ์†Œ๊ฐœ์„œ, ์ด๋ ฅ์„œ ์“ฐ๋Š” ๋ฒ•
์ž๊ธฐ์†Œ๊ฐœ์„œ, ์ด๋ ฅ์„œ ์“ฐ๋Š” ๋ฒ•
ย 
Redis From 2.8 to 4.x
Redis From 2.8 to 4.xRedis From 2.8 to 4.x
Redis From 2.8 to 4.x
ย 
๋ฐฑ์–ต๊ฐœ์˜ ๋กœ๊ทธ๋ฅผ ๋ชจ์•„ ๊ฒ€์ƒ‰ํ•˜๊ณ  ๋ถ„์„ํ•˜๊ณ  ํ•™์Šต๋„ ์‹œ์ผœ๋ณด์ž : ๋กœ๊ธฐ์Šค
๋ฐฑ์–ต๊ฐœ์˜ ๋กœ๊ทธ๋ฅผ ๋ชจ์•„ ๊ฒ€์ƒ‰ํ•˜๊ณ  ๋ถ„์„ํ•˜๊ณ  ํ•™์Šต๋„ ์‹œ์ผœ๋ณด์ž : ๋กœ๊ธฐ์Šค๋ฐฑ์–ต๊ฐœ์˜ ๋กœ๊ทธ๋ฅผ ๋ชจ์•„ ๊ฒ€์ƒ‰ํ•˜๊ณ  ๋ถ„์„ํ•˜๊ณ  ํ•™์Šต๋„ ์‹œ์ผœ๋ณด์ž : ๋กœ๊ธฐ์Šค
๋ฐฑ์–ต๊ฐœ์˜ ๋กœ๊ทธ๋ฅผ ๋ชจ์•„ ๊ฒ€์ƒ‰ํ•˜๊ณ  ๋ถ„์„ํ•˜๊ณ  ํ•™์Šต๋„ ์‹œ์ผœ๋ณด์ž : ๋กœ๊ธฐ์Šค
ย 

Similar to Soma search

ํŒŒ์ด์ฌ์„ ํ™œ์šฉํ•œ ์ž์—ฐ์–ด ๋ถ„์„ - 2์ฐจ
ํŒŒ์ด์ฌ์„ ํ™œ์šฉํ•œ ์ž์—ฐ์–ด ๋ถ„์„ - 2์ฐจํŒŒ์ด์ฌ์„ ํ™œ์šฉํ•œ ์ž์—ฐ์–ด ๋ถ„์„ - 2์ฐจ
ํŒŒ์ด์ฌ์„ ํ™œ์šฉํ•œ ์ž์—ฐ์–ด ๋ถ„์„ - 2์ฐจ
๊น€์šฉ๋ฒ” | ๋ฌด์˜์ธํ„ฐ๋‚ด์‡ผ๋‚ 
ย 
ํŒŒ์ด์ฌ์„ ํ™œ์šฉํ•œ ์ž์—ฐ์–ด๋ถ„์„ ๊ธฐ์ดˆ
ํŒŒ์ด์ฌ์„ ํ™œ์šฉํ•œ ์ž์—ฐ์–ด๋ถ„์„ ๊ธฐ์ดˆํŒŒ์ด์ฌ์„ ํ™œ์šฉํ•œ ์ž์—ฐ์–ด๋ถ„์„ ๊ธฐ์ดˆ
ํŒŒ์ด์ฌ์„ ํ™œ์šฉํ•œ ์ž์—ฐ์–ด๋ถ„์„ ๊ธฐ์ดˆ
๊น€์šฉ๋ฒ” | ๋ฌด์˜์ธํ„ฐ๋‚ด์‡ผ๋‚ 
ย 
ํ…์ŠคํŠธ ๋งˆ์ด๋‹ ๊ธฐ๋ณธ ์ •๋ฆฌ(๋ง๋ญ‰์น˜, ํ…์ŠคํŠธ ์ „์ฒ˜๋ฆฌ ์ ˆ์ฐจ, TF, IDF ๊ธฐํƒ€)
ํ…์ŠคํŠธ ๋งˆ์ด๋‹ ๊ธฐ๋ณธ ์ •๋ฆฌ(๋ง๋ญ‰์น˜, ํ…์ŠคํŠธ ์ „์ฒ˜๋ฆฌ ์ ˆ์ฐจ, TF, IDF ๊ธฐํƒ€)ํ…์ŠคํŠธ ๋งˆ์ด๋‹ ๊ธฐ๋ณธ ์ •๋ฆฌ(๋ง๋ญ‰์น˜, ํ…์ŠคํŠธ ์ „์ฒ˜๋ฆฌ ์ ˆ์ฐจ, TF, IDF ๊ธฐํƒ€)
ํ…์ŠคํŠธ ๋งˆ์ด๋‹ ๊ธฐ๋ณธ ์ •๋ฆฌ(๋ง๋ญ‰์น˜, ํ…์ŠคํŠธ ์ „์ฒ˜๋ฆฌ ์ ˆ์ฐจ, TF, IDF ๊ธฐํƒ€)
limdongjo ์ž„๋™์กฐ
ย 
[Langcon2020]๋กฏ๋ฐ์˜ ๋”ฅ๋Ÿฌ๋‹ ๋ชจ๋ธ์€ ์–ด๋–ป๊ฒŒ ์ž๊ธฐ์†Œ๊ฐœ์„œ๋ฅผ ์ฝ๊ณ  ์žˆ์„๊นŒ?
[Langcon2020]๋กฏ๋ฐ์˜ ๋”ฅ๋Ÿฌ๋‹ ๋ชจ๋ธ์€ ์–ด๋–ป๊ฒŒ ์ž๊ธฐ์†Œ๊ฐœ์„œ๋ฅผ ์ฝ๊ณ  ์žˆ์„๊นŒ?[Langcon2020]๋กฏ๋ฐ์˜ ๋”ฅ๋Ÿฌ๋‹ ๋ชจ๋ธ์€ ์–ด๋–ป๊ฒŒ ์ž๊ธฐ์†Œ๊ฐœ์„œ๋ฅผ ์ฝ๊ณ  ์žˆ์„๊นŒ?
[Langcon2020]๋กฏ๋ฐ์˜ ๋”ฅ๋Ÿฌ๋‹ ๋ชจ๋ธ์€ ์–ด๋–ป๊ฒŒ ์ž๊ธฐ์†Œ๊ฐœ์„œ๋ฅผ ์ฝ๊ณ  ์žˆ์„๊นŒ?
ssuseraf7587
ย 
Recommendation with deep learning
Recommendation with deep learningRecommendation with deep learning
Recommendation with deep learning
kwon soonmok
ย 
แ„ƒแ…ตแ„†แ…ตแ„แ…ฅแ†ซ แ„‹แ…ฅแ„…แ…ตแ†ซแ„‹แ…ตแ„แ…ฅแ†ทแ„‘แ…ฒแ„แ…ฅแ„€แ…ญแ„‹แ…ฒแ†จ 9แ„Œแ…ฎแ„Žแ…ก
แ„ƒแ…ตแ„†แ…ตแ„แ…ฅแ†ซ แ„‹แ…ฅแ„…แ…ตแ†ซแ„‹แ…ตแ„แ…ฅแ†ทแ„‘แ…ฒแ„แ…ฅแ„€แ…ญแ„‹แ…ฒแ†จ 9แ„Œแ…ฎแ„Žแ…กแ„ƒแ…ตแ„†แ…ตแ„แ…ฅแ†ซ แ„‹แ…ฅแ„…แ…ตแ†ซแ„‹แ…ตแ„แ…ฅแ†ทแ„‘แ…ฒแ„แ…ฅแ„€แ…ญแ„‹แ…ฒแ†จ 9แ„Œแ…ฎแ„Žแ…ก
แ„ƒแ…ตแ„†แ…ตแ„แ…ฅแ†ซ แ„‹แ…ฅแ„…แ…ตแ†ซแ„‹แ…ตแ„แ…ฅแ†ทแ„‘แ…ฒแ„แ…ฅแ„€แ…ญแ„‹แ…ฒแ†จ 9แ„Œแ…ฎแ„Žแ…ก
jiyein
ย 
CoreDot TechSeminar 2018 - Session3 Doh Seungheon
CoreDot TechSeminar 2018 - Session3 Doh SeungheonCoreDot TechSeminar 2018 - Session3 Doh Seungheon
CoreDot TechSeminar 2018 - Session3 Doh Seungheon
Core.Today
ย 

Similar to Soma search (7)

ํŒŒ์ด์ฌ์„ ํ™œ์šฉํ•œ ์ž์—ฐ์–ด ๋ถ„์„ - 2์ฐจ
ํŒŒ์ด์ฌ์„ ํ™œ์šฉํ•œ ์ž์—ฐ์–ด ๋ถ„์„ - 2์ฐจํŒŒ์ด์ฌ์„ ํ™œ์šฉํ•œ ์ž์—ฐ์–ด ๋ถ„์„ - 2์ฐจ
ํŒŒ์ด์ฌ์„ ํ™œ์šฉํ•œ ์ž์—ฐ์–ด ๋ถ„์„ - 2์ฐจ
ย 
ํŒŒ์ด์ฌ์„ ํ™œ์šฉํ•œ ์ž์—ฐ์–ด๋ถ„์„ ๊ธฐ์ดˆ
ํŒŒ์ด์ฌ์„ ํ™œ์šฉํ•œ ์ž์—ฐ์–ด๋ถ„์„ ๊ธฐ์ดˆํŒŒ์ด์ฌ์„ ํ™œ์šฉํ•œ ์ž์—ฐ์–ด๋ถ„์„ ๊ธฐ์ดˆ
ํŒŒ์ด์ฌ์„ ํ™œ์šฉํ•œ ์ž์—ฐ์–ด๋ถ„์„ ๊ธฐ์ดˆ
ย 
ํ…์ŠคํŠธ ๋งˆ์ด๋‹ ๊ธฐ๋ณธ ์ •๋ฆฌ(๋ง๋ญ‰์น˜, ํ…์ŠคํŠธ ์ „์ฒ˜๋ฆฌ ์ ˆ์ฐจ, TF, IDF ๊ธฐํƒ€)
ํ…์ŠคํŠธ ๋งˆ์ด๋‹ ๊ธฐ๋ณธ ์ •๋ฆฌ(๋ง๋ญ‰์น˜, ํ…์ŠคํŠธ ์ „์ฒ˜๋ฆฌ ์ ˆ์ฐจ, TF, IDF ๊ธฐํƒ€)ํ…์ŠคํŠธ ๋งˆ์ด๋‹ ๊ธฐ๋ณธ ์ •๋ฆฌ(๋ง๋ญ‰์น˜, ํ…์ŠคํŠธ ์ „์ฒ˜๋ฆฌ ์ ˆ์ฐจ, TF, IDF ๊ธฐํƒ€)
ํ…์ŠคํŠธ ๋งˆ์ด๋‹ ๊ธฐ๋ณธ ์ •๋ฆฌ(๋ง๋ญ‰์น˜, ํ…์ŠคํŠธ ์ „์ฒ˜๋ฆฌ ์ ˆ์ฐจ, TF, IDF ๊ธฐํƒ€)
ย 
[Langcon2020]๋กฏ๋ฐ์˜ ๋”ฅ๋Ÿฌ๋‹ ๋ชจ๋ธ์€ ์–ด๋–ป๊ฒŒ ์ž๊ธฐ์†Œ๊ฐœ์„œ๋ฅผ ์ฝ๊ณ  ์žˆ์„๊นŒ?
[Langcon2020]๋กฏ๋ฐ์˜ ๋”ฅ๋Ÿฌ๋‹ ๋ชจ๋ธ์€ ์–ด๋–ป๊ฒŒ ์ž๊ธฐ์†Œ๊ฐœ์„œ๋ฅผ ์ฝ๊ณ  ์žˆ์„๊นŒ?[Langcon2020]๋กฏ๋ฐ์˜ ๋”ฅ๋Ÿฌ๋‹ ๋ชจ๋ธ์€ ์–ด๋–ป๊ฒŒ ์ž๊ธฐ์†Œ๊ฐœ์„œ๋ฅผ ์ฝ๊ณ  ์žˆ์„๊นŒ?
[Langcon2020]๋กฏ๋ฐ์˜ ๋”ฅ๋Ÿฌ๋‹ ๋ชจ๋ธ์€ ์–ด๋–ป๊ฒŒ ์ž๊ธฐ์†Œ๊ฐœ์„œ๋ฅผ ์ฝ๊ณ  ์žˆ์„๊นŒ?
ย 
Recommendation with deep learning
Recommendation with deep learningRecommendation with deep learning
Recommendation with deep learning
ย 
แ„ƒแ…ตแ„†แ…ตแ„แ…ฅแ†ซ แ„‹แ…ฅแ„…แ…ตแ†ซแ„‹แ…ตแ„แ…ฅแ†ทแ„‘แ…ฒแ„แ…ฅแ„€แ…ญแ„‹แ…ฒแ†จ 9แ„Œแ…ฎแ„Žแ…ก
แ„ƒแ…ตแ„†แ…ตแ„แ…ฅแ†ซ แ„‹แ…ฅแ„…แ…ตแ†ซแ„‹แ…ตแ„แ…ฅแ†ทแ„‘แ…ฒแ„แ…ฅแ„€แ…ญแ„‹แ…ฒแ†จ 9แ„Œแ…ฎแ„Žแ…กแ„ƒแ…ตแ„†แ…ตแ„แ…ฅแ†ซ แ„‹แ…ฅแ„…แ…ตแ†ซแ„‹แ…ตแ„แ…ฅแ†ทแ„‘แ…ฒแ„แ…ฅแ„€แ…ญแ„‹แ…ฒแ†จ 9แ„Œแ…ฎแ„Žแ…ก
แ„ƒแ…ตแ„†แ…ตแ„แ…ฅแ†ซ แ„‹แ…ฅแ„…แ…ตแ†ซแ„‹แ…ตแ„แ…ฅแ†ทแ„‘แ…ฒแ„แ…ฅแ„€แ…ญแ„‹แ…ฒแ†จ 9แ„Œแ…ฎแ„Žแ…ก
ย 
CoreDot TechSeminar 2018 - Session3 Doh Seungheon
CoreDot TechSeminar 2018 - Session3 Doh SeungheonCoreDot TechSeminar 2018 - Session3 Doh Seungheon
CoreDot TechSeminar 2018 - Session3 Doh Seungheon
ย 

More from DaeMyung Kang

Count min sketch
Count min sketchCount min sketch
Count min sketch
DaeMyung Kang
ย 
Redis
RedisRedis
Redis
DaeMyung Kang
ย 
Ansible
AnsibleAnsible
Ansible
DaeMyung Kang
ย 
Why GUID is needed
Why GUID is neededWhy GUID is needed
Why GUID is needed
DaeMyung Kang
ย 
How to use redis well
How to use redis wellHow to use redis well
How to use redis well
DaeMyung Kang
ย 
The easiest consistent hashing
The easiest consistent hashingThe easiest consistent hashing
The easiest consistent hashing
DaeMyung Kang
ย 
How to name a cache key
How to name a cache keyHow to name a cache key
How to name a cache key
DaeMyung Kang
ย 
Integration between Filebeat and logstash
Integration between Filebeat and logstash Integration between Filebeat and logstash
Integration between Filebeat and logstash
DaeMyung Kang
ย 
How To Become Better Engineer
How To Become Better EngineerHow To Become Better Engineer
How To Become Better Engineer
DaeMyung Kang
ย 
Kafka timestamp offset_final
Kafka timestamp offset_finalKafka timestamp offset_final
Kafka timestamp offset_final
DaeMyung Kang
ย 
Kafka timestamp offset
Kafka timestamp offsetKafka timestamp offset
Kafka timestamp offset
DaeMyung Kang
ย 
Data pipeline and data lake
Data pipeline and data lakeData pipeline and data lake
Data pipeline and data lake
DaeMyung Kang
ย 
Redis acl
Redis aclRedis acl
Redis acl
DaeMyung Kang
ย 
Coffee store
Coffee storeCoffee store
Coffee store
DaeMyung Kang
ย 
Scalable webservice
Soma search

  • 1. The Principles of a Very Simple Search Engine. DaeMyung Kang (CHARSYAM@NAVER.COM)
  • 2. Confession!!! My only hands-on search engine experience is a part-time job in my university lab, which happened to be a lab that built search engines. Beyond that, I have only studied the topic a little on my own. In other words, a fair amount of this may be bluffing.
  • 4. Why? When do you need a search engine? When you need to retrieve information ordered by ranking. For example, what if you want the documents that contain both the words "파이썬" (Python) and "KOREA"?
  • 6. Rather than building one for no good reason, you are much better off using the engines above (Elasticsearch, Solr).
  • 7. Then why do this? Just for fun… and because, after seeing an entertaining deck, I wanted to write up my own summary.
  • 8. Kim Jong-min's (김종민) 데놀 talk: https://www.slideshare.net/kjmorc/ss-80803233 That deck alone is enough. Mine is the "bluff" version.
  • 11. Query process (academic view). Diagram components: User, User Interaction, Ranking, Evaluation, Index, Data Store, Log Data.
  • 12. Components of a search engine. Indexing process: crawling + building the inverted index. Query process: index lookup + ranking.
  • 13. Crawling #0: Collecting web pages with the Requests module: r = requests.get('http://www.naver.com')
  • 14. Crawling #1: Questions you must ask first (non-technical). Are you even allowed to crawl? Check robots.txt. A typical complaint from the other side: "Google's search bot is attacking our server."
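The robots.txt question can be checked programmatically. The following is a minimal sketch using Python's standard `urllib.robotparser`; the robots.txt content, the `my-crawler` user agent, and the example.com URLs are made up for illustration, and a real crawler would download the file from the target site first.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (an assumption for this sketch).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""

def build_parser(robots_txt):
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser

parser = build_parser(ROBOTS_TXT)

# Ask before fetching: is this URL allowed for our (hypothetical) user agent?
allowed = parser.can_fetch("my-crawler", "http://example.com/index.html")
blocked = parser.can_fetch("my-crawler", "http://example.com/private/data.html")

# Crawl-delay tells a polite crawler how many seconds to wait between requests.
delay = parser.crawl_delay("my-crawler")
```

Honoring the crawl delay is one way to avoid being the bot that "attacks" someone's server.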
  • 15. Crawling #2: A simple idea: a crawling loop that takes the next URL from a Redis List with BLPOP.
  • 16. Crawling #3: Once a page has been fetched: extract the links; extract the meaningful params; convert the encoding; extract the text (strip the tags).
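The four post-fetch steps can be sketched with the standard library alone. Everything named here is an assumption for illustration: the `MEANINGFUL_PARAMS` whitelist, the `PageParser` class, and the sample HTML are hypothetical, and a production crawler would more likely use a dedicated HTML library.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qsl, urlencode, urljoin, urlsplit, urlunsplit

# Assumption: only these query parameters change what the page shows.
MEANINGFUL_PARAMS = {"id", "page"}

def normalize(url):
    """Keep only meaningful query params and drop the fragment."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query) if k in MEANINGFUL_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(sorted(query)), ""))

class PageParser(HTMLParser):
    """Collects normalized links and visible text from one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self.text_parts = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(normalize(urljoin(self.base_url, href)))

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.text_parts.append(data.strip())

def extract(html_bytes, base_url, encoding="utf-8"):
    """Decode the raw bytes, then return (links, text)."""
    html = html_bytes.decode(encoding, errors="replace")  # encoding conversion
    parser = PageParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.links, " ".join(parser.text_parts)

sample = (b'<html><body><h1>Hello</h1>'
          b'<a href="/view?id=7&amp;ref=ad#top">next</a>'
          b'<script>var a = 1;</script></body></html>')
links, text = extract(sample, "http://example.com/start")
```

Normalizing URLs before queueing them keeps the crawler from treating the same page with different tracking parameters as different pages.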
  • 17. Crawling #4: Should the same page be revisited? If it should not, how do we record that it was already visited? If it should, every how many days?
  • 18. Crawling #5: Do we need to store the pages? Where? A distributed file system? (This is what Google built BigTable for; see also HBase, Cassandra, or other column-oriented storage.) A DB? And what data should be stored: the original, or a converted form?
  • 19. Crawling #6: A simple idea: a crawling loop that takes the addresses to visit from a Queue and writes the visit time/interval and the crawled data to Storage.
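The queue, crawling loop, and storage can be sketched as follows. This is a minimal in-process illustration, not the deck's implementation: a `collections.deque` stands in for the queue (the earlier slide suggests a Redis List with BLPOP), a dict stands in for storage, and `fetch`, `extract_links`, `revisit_after`, and the tiny fake web are all hypothetical names introduced here.

```python
import time
from collections import deque

def crawl(seeds, fetch, extract_links, revisit_after=86400.0, max_fetches=1000):
    """Visit URLs breadth-first, recording visit time and page body.

    A URL is skipped if it was visited less than revisit_after seconds ago.
    """
    frontier = deque(seeds)   # addresses to visit
    storage = {}              # url -> {"visited_at": ..., "body": ...}
    fetches = 0
    while frontier and fetches < max_fetches:
        url = frontier.popleft()
        record = storage.get(url)
        if record is not None and time.time() - record["visited_at"] < revisit_after:
            continue  # visited recently, so do not revisit yet
        body = fetch(url)
        fetches += 1
        storage[url] = {"visited_at": time.time(), "body": body}
        frontier.extend(extract_links(url, body))
    return storage

# Tiny in-memory "web" so the sketch runs without network access.
FAKE_WEB = {"a": ["b", "c"], "b": ["a"], "c": []}
visited_order = []

def fake_fetch(url):
    visited_order.append(url)
    return "body of " + url

def fake_links(url, body):
    return FAKE_WEB[url]

store = crawl(["a"], fake_fetch, fake_links)
```

Page "b" links back to "a", but "a" is not fetched twice because its recorded visit time is still within the revisit interval.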
  • 20. Indexing #0: What if we index the following documents?
      DOC1: The bright blue butterfly hangs on the breeze
      DOC2: It's best to forget the great sky and to retire from every wind
      DOC3: Under blue sky, in bright sunlight, one need not search around
  • 21. Indexing #1 (tokenizing): First, split the text into words.
      DOC1: The | bright | blue | butterfly | hangs | on | the | breeze
      DOC2: It's | best | to | forget | the | great | sky | and | to | retire | from | every | wind.
      DOC3: Under | blue | sky | in | bright | sunlight | one | need | not | search | around.
  • 22. Indexing #2 (transformation): Remove special characters.
      DOC1: The | bright | blue | butterfly | hangs | on | the | breeze
      DOC2: It's | best | to | forget | the | great | sky | and | to | retire | from | every | wind
      DOC3: Under | blue | sky | in | bright | sunlight | one | need | not | search | around
  • 23. Indexing #3 (inverted index): Convert the document -> word mapping into a word -> document mapping.
      Word and document number, in document order:
      The 1, bright 1, blue 1, butterfly 1, hangs 1, on 1, the 1, breeze 1
      It's 2, best 2, to 2, forget 2, the 2, great 2, sky 2, and 2
      to 2, retire 2, from 2, every 2, wind 2, Under 3, blue 3, sky 3
      in 3, bright 3, sunlight 3, one 3, need 3, not 3, search 3, around 3
  • 24. Indexing #4 (merging identical words): What do we have to do in order to merge identical words?
      Word and document number, in document order:
      The 1, bright 1, blue 1, butterfly 1, hangs 1, on 1, the 1, breeze 1
      It's 2, best 2, to 2, forget 2, the 2, great 2, sky 2, and 2
      to 2, retire 2, from 2, every 2, wind 2, Under 3, blue 3, sky 3
      in 3, bright 3, sunlight 3, one 3, need 3, not 3, search 3, around 3
  • 25. Indexing #4-1 (case conversion): Convert uppercase to lowercase.
      Word and document number, in document order:
      the 1, bright 1, blue 1, butterfly 1, hangs 1, on 1, the 1, breeze 1
      it's 2, best 2, to 2, forget 2, the 2, great 2, sky 2, and 2
      to 2, retire 2, from 2, every 2, wind 2, under 3, blue 3, sky 3
      in 3, bright 3, sunlight 3, one 3, need 3, not 3, search 3, around 3
  • 26. Indexing #4-2 (sorting): The entries are also sorted to prepare for the next steps.
      Word and document number, sorted by word:
      and 2, around 3, best 2, blue 1, blue 3, breeze 1, bright 1, bright 3
      butterfly 1, every 2, forget 2, from 2, great 2, hangs 1, in 3, it's 2
      need 3, not 3, on 1, one 3, retire 2, search 3, sky 2, sky 3
      sunlight 3, the 1, the 1, the 2, to 2, to 2, under 3, wind 2
  • 27. Indexing #4-3 (stop word removal): Drop the words that are too common to be useful; they have no value as search terms.
      Stop words: a, and, around, every, for, from, in, it, it's, not, on, one, the, to, under, …
  • 28. Indexing #4-4 (stop word removal): Delete the unused words.
      Remaining entries: best 2, blue 1, blue 3, breeze 1, bright 1, bright 3, butterfly 1, forget 2, great 2, hangs 1, need 3, retire 2, search 3, sky 2, sky 3, sunlight 3, wind 2
  • 29. Indexing #4-5 (stop word removal): The resulting table.
      best 2, blue 1, blue 3, breeze 1, bright 1, bright 3, butterfly 1, forget 2, great 2, hangs 1, need 3, retire 2, search 3, sky 2, sky 3, sunlight 3, wind 2
  • 30. Indexing #4-6 (stemming): Reduce verbs to their base form, that is, split each word into stem and ending and keep only the stem (strip ~s, ~es, ~ed, ~ing, and so on). Here "hangs" becomes "hang".
      best 2, blue 1, blue 3, breeze 1, bright 1, bright 3, butterfly 1, forget 2, great 2, hang 1, need 3, retire 2, search 3, sky 2, sky 3, sunlight 3, wind 2
  • 31. Indexing #4-7 (merging): Entries for the same word are merged into one entry with a list of documents.
      best: 2; blue: 1,3; breeze: 1; bright: 1,3; butterfly: 1; forget: 2; great: 2; hang: 1; need: 3; retire: 2; search: 3; sky: 2,3; sunlight: 3; wind: 2
  • 32. Indexing #4-8 (merging): The final inverted index.
      best: 2; blue: 1,3; breeze: 1; bright: 1,3; butterfly: 1; forget: 2; great: 2; hang: 1
      need: 3; retire: 2; search: 3; sky: 2,3; sunlight: 3; wind: 2
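The indexing steps above (tokenize, strip special characters, lowercase, remove stop words, stem, merge) can be sketched end to end. This is a toy version under stated assumptions: the stop word set contains only the words listed on the slide, and `stem` is a hypothetical suffix stripper that is just good enough for these three documents, where a real system would use a proper stemmer.

```python
from collections import defaultdict

DOCS = {
    1: "The bright blue butterfly hangs on the breeze",
    2: "It's best to forget the great sky and to retire from every wind",
    3: "Under blue sky, in bright sunlight, one need not search around",
}

# Only the stop words shown on the slide; the slide's list ends with "…".
STOP_WORDS = {"a", "and", "around", "every", "for", "from", "in", "it", "it's",
              "not", "on", "one", "the", "to", "under"}

def stem(word):
    """Toy stemmer: strip one common suffix if at least 3 letters remain."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word

def build_index(docs):
    """Return {word: sorted list of document ids}, sorted by word."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():                # tokenize on whitespace
            word = token.strip(".,!?").lower()    # strip punctuation, lowercase
            if not word or word in STOP_WORDS:    # drop stop words
                continue
            postings[stem(word)].add(doc_id)      # stem, then merge per word
    return {word: sorted(ids) for word, ids in sorted(postings.items())}

index = build_index(DOCS)
```

Using a set per word makes the "merge identical words" step implicit: adding the same word from a second document simply extends its posting list.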
  • 34. Indexing #6: English is fairly easy, but what about Korean? Store only the words produced by a morphological analyzer. Open-source Korean morphological analyzers (Python support, dictionary-based): the 은전한닢 project (mecab-based), KoNLPy, Komoran.
  • 35. Indexing #7 (N-grams): With morphological analysis, some words may end up not being indexed, and analysis is hard when the text is not properly spaced. The N-gram approach works without any knowledge of the language. In exchange, the result quality can be poor.
  • 36. Indexing #8 (N-grams): With 2-grams (bigrams), "서핑클럽" ("surfing club") is indexed as three two-character terms: "서핑", "핑클", "클럽". With 3-grams (trigrams), "서핑클럽" is indexed as two three-character terms: "서핑클", "핑클럽".
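Character N-grams are a sliding window over the string. A minimal sketch (the function name `char_ngrams` is introduced here for illustration):

```python
def char_ngrams(text, n):
    """Return every run of n consecutive characters in text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

bigrams = char_ngrams("서핑클럽", 2)    # ['서핑', '핑클', '클럽']
trigrams = char_ngrams("서핑클럽", 3)   # ['서핑클', '핑클럽']
```

Note that the bigram "핑클" is not a meaningful part of "surfing club", which illustrates why N-gram indexes can return lower-quality matches.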
  • 37. Indexing #9 (search): The query goes through the same processing as the documents. "best" is in document 2. "blue" is in documents 1 and 3. A search for blue & sky matches document 3. Searching for a stop word returns no results.
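An AND query is an intersection of posting lists. The sketch below hard-codes the final index from the earlier slide so it runs on its own; `search` is a hypothetical helper, and it applies only lowercasing and stop word filtering to the query (stemming is left out for brevity).

```python
INDEX = {
    "best": [2], "blue": [1, 3], "breeze": [1], "bright": [1, 3],
    "butterfly": [1], "forget": [2], "great": [2], "hang": [1],
    "need": [3], "retire": [2], "search": [3], "sky": [2, 3],
    "sunlight": [3], "wind": [2],
}
STOP_WORDS = {"a", "and", "around", "every", "for", "from", "in", "it", "it's",
              "not", "on", "one", "the", "to", "under"}

def search(query, index):
    """Return the ids of documents that contain every query term."""
    terms = [t.lower() for t in query.replace("&", " ").split()]
    terms = [t for t in terms if t not in STOP_WORDS]
    if not terms:
        return []  # nothing left to look up, e.g. a stop-word-only query
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        result &= set(index.get(term, []))  # AND = set intersection
    return sorted(result)

best_docs = search("best", INDEX)            # [2]
blue_docs = search("blue", INDEX)            # [1, 3]
blue_sky_docs = search("blue & sky", INDEX)  # [3]
stop_docs = search("the", INDEX)             # []
```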
  • 38. Indexing #10 (questions): So how should we process terms whose inverted index entries cover an enormous number of documents? What about words like Animal or Apple?
  • 39. Ranking #0: How can we rank documents? What makes a document a good document?
  • 40. Ranking #1: A good document is one that many other documents link to (PageRank), especially if other good documents link to it; one that is updated frequently; and one in which the search term plays an important role.
  • 41. Ranking #2 (position information): If we search for "blue sky", which is the better match, document 1 or document 3?
      Word and document:position entries:
      best 2:10; blue 1:100, 3:50; breeze 1:30; bright 1:50, 3:55; butterfly 1:20; forget 2:40; great 2:60; hang 1:400
      need 3:300; retire 2:100; search 3:500; sky 2:20, 3:55; sunlight 3:400; wind 2:10
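One way to use the position information is to prefer documents where the query terms occur close together. The slide does not say how positions feed into the score, so treating proximity as the signal is an assumption here, and `min_gap` is a hypothetical helper. The positions are the slide's values for "blue" and "sky".

```python
# word -> {document id: [positions]}, using the slide's values.
POSITIONS = {
    "blue": {1: [100], 3: [50]},
    "sky": {2: [20], 3: [55]},
}

def min_gap(term_a, term_b, doc_id, positions):
    """Smallest distance between the two terms in one document.

    Returns None if the document does not contain both terms.
    """
    a = positions.get(term_a, {}).get(doc_id, [])
    b = positions.get(term_b, {}).get(doc_id, [])
    if not a or not b:
        return None
    return min(abs(x - y) for x in a for y in b)

gap_doc1 = min_gap("blue", "sky", 1, POSITIONS)  # None: no "sky" entry for doc 1
gap_doc3 = min_gap("blue", "sky", 3, POSITIONS)  # 5: positions 50 and 55
```

Under this assumption document 3 is the better match for "blue sky", since both terms appear and they are only 5 positions apart.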
  • 42. Ranking #3: How can we tell that a particular word is important within a given document? TF-IDF.
  • 43. Ranking #4: What is TF-IDF? The idea is that if a word appears often in one document but rarely across the whole collection, it is likely to be a key term of that document. TF (Term Frequency): how many times does the word appear in one document? DF (Document Frequency): in how many documents of the whole collection does the word appear?
  • 44. Ranking #5: The word "card news" (카드뉴스) is found in 3 of 10 documents. The word "today" (오늘) is found in 9 of 10 documents. log(10/9) ≈ 0.046 (base 10), so the more documents a word appears in, the smaller the value becomes; this is the defining property of TF-IDF.
      Keyword     URL    TF   TF*IDF
      card news   DOC1   5    5 * log(10/3) = 5 * 0.52
      card news   DOC2   3    3 * log(10/3) = 3 * 0.52
      card news   DOC3   10   10 * log(10/3) = 10 * 0.52
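The table's arithmetic can be reproduced directly. The slide's values (0.52 for log(10/3)) imply a base-10 logarithm, which this sketch assumes; the function names are introduced here for illustration.

```python
import math

def idf(df, total_docs):
    """Inverse document frequency with a base-10 log, as on the slide."""
    return math.log10(total_docs / df)

def tf_idf(tf, df, total_docs):
    """Term frequency times inverse document frequency."""
    return tf * idf(df, total_docs)

idf_card_news = idf(3, 10)   # found in 3 of 10 documents, about 0.52
idf_today = idf(9, 10)       # found in 9 of 10 documents, about 0.046

# TF*IDF of "card news" for the three documents in the table.
scores = {doc: tf_idf(tf, 3, 10) for doc, tf in
          {"DOC1": 5, "DOC2": 3, "DOC3": 10}.items()}
```

Because all three rows share the same IDF, the ranking among them is decided by TF alone: DOC3, then DOC1, then DOC2.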
  • 45. Ranking #6 (questions): The score basically depends on TF and DF. So for the same DF, a higher TF gives a higher score. Which of the following documents scores higher, then? (a) "search engine" (b) "search engine search engine"
  • 46. Ranking #7 (BM25): Because TF-IDF is affected by document length, BM25 is an improved algorithm that takes the average document length into account. Elasticsearch reportedly uses it.
  • 47. Ranking #8 (BM25): IDF = gives a smaller value the more often the term appears across the whole collection.
  • 48. Ranking #9 (BM25): TF = Term Frequency. IDF = gives a smaller value the more often the term appears across the whole collection.
  • 49. Ranking #10 (BM25): k1, b = constants that are simply chosen. k1 = weight on tf, b = weight on the document (its length normalization). |D| = length of the document. avgdl = average document length. In conclusion, a match in a document shorter than the average length scores higher.
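The formula images from the BM25 slides did not survive in this transcript, so the sketch below uses the commonly cited Okapi BM25 form for a single query term rather than the deck's exact notation. The defaults `k1=1.2` and `b=0.75` are conventional choices, and the IDF variant with `+ 1` inside the logarithm (which keeps the value non-negative) is one of several in use; both are assumptions here.

```python
import math

def bm25_term_score(tf, df, total_docs, doc_len, avgdl, k1=1.2, b=0.75):
    """BM25 score contribution of one query term for one document.

    tf: occurrences of the term in the document
    df: number of documents containing the term
    doc_len: length of this document (|D|); avgdl: average document length
    """
    # Non-negative IDF variant: rarer terms get larger values.
    idf = math.log((total_docs - df + 0.5) / (df + 0.5) + 1.0)
    # Length normalization: above 1 for long documents, below 1 for short ones.
    norm = 1.0 - b + b * doc_len / avgdl
    return idf * tf * (k1 + 1.0) / (tf + k1 * norm)

# Same term statistics, different document lengths.
short_doc = bm25_term_score(tf=2, df=3, total_docs=10, doc_len=50, avgdl=100)
long_doc = bm25_term_score(tf=2, df=3, total_docs=10, doc_len=200, avgdl=100)

# Same document, but a term that appears in almost every document.
common_term = bm25_term_score(tf=2, df=9, total_docs=10, doc_len=50, avgdl=100)
```

With identical TF and DF, the shorter document gets the higher score, which matches the slide's conclusion. The TF part also saturates: as `tf` grows, the fraction approaches `k1 + 1` instead of growing without bound, which addresses the "search engine search engine" question raised earlier.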