Chapter 4
Parsing XML at the Speed of Light
Kyungryul KIM
XML parsing models
■ SAX (Simple API for XML): takes a stream as input and delivers the data through callbacks such as "start of tag", "end of tag", and "character data".
■ Pull parsing: similar to SAX, but the caller drives the parser through an iterator object.
■ DOM (Document Object Model): converts the input into a document object first, then processes that object.
pugixml: a DOM parser
A DOM parser is a good fit for:
■ Documents small enough to fit in memory.
■ Documents with a complex structure, where the nodes to be visited refer to one another.
■ Documents that need complex transformations.
pugixml design choices
■ Developed with the goal of being a very fast and lightweight DOM-based XML manipulation library.
■ A trade-off between performance and XML validation:
■ Well-formedness is checked.
■ The DTD (Document Type Declaration) is not validated.
■ Some documents that are not well-formed are still parsed successfully.
Parsing
■ Parsing runs directly on the character stream rather than on a token stream.
■ Only UTF-8 characters are supported.
■ In-place parsing: the parser works directly on the data in the stream, in order to minimize string copying.
In-place parsing
■ When the parser encounters a string, it stores a pointer to that string and its length.
■ Performance improves, but memory usage increases, because the original stream has to be kept alive.
    0xabc0: <n>The node text</n>
    pointer = 0xabc3, length = 13
In-place parsing - null termination
■ A null terminator is written into the buffer to make string access faster.
■ In XML, the character right after the end of a string is a markup character such as <, so it can be overwritten.
    0xabc0: <n>The node text\0/n>
    pointer = 0xabc3, length = 13
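
The null-termination step can be sketched as follows. This is a minimal illustration, not pugixml's code: `terminate_text` is a hypothetical helper, and it assumes a mutable, null-terminated input buffer.

```cpp
#include <cassert>

// Hypothetical helper (not a pugixml function): `text` points at the first
// character of element content inside a mutable, null-terminated buffer.
// The '<' that ends the content is overwritten with '\0', so `text` becomes
// an ordinary C string without any copy. Returns the position just past the
// overwritten '<', or the terminator position if no '<' was found.
char* terminate_text(char* text) {
    assert(text != nullptr);
    char* s = text;
    while (*s != '<' && *s != '\0') ++s;
    if (*s == '\0') return s;  // truncated input: nothing to overwrite
    *s = '\0';                 // the markup character is no longer needed
    return s + 1;
}
```

For the buffer `<n>The node text</n>`, calling `terminate_text(buf + 3)` leaves `buf + 3` readable as the C string `The node text`, and the returned pointer refers to the remaining `/n>`.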
In-place parsing - text transformations
■ When a string's value differs from its representation in the file, it is converted during parsing.
■ End-of-line handling: `line1\xD\xAline2\xDline3\xA\xA` becomes `line1\xAline2\xAline3\xA\xA`.
■ Character reference expansion: &#97; becomes a.
■ Entity reference expansion: &lt; (<), &gt; (>), &quot; ("), &apos; ('), &amp; (&).
■ Attribute-value normalization: every whitespace character is converted to a space.
In-place parsing - text transformations (continued)
■ A transformation must never make the string longer.
■ If the result were longer, it could overwrite document data that has not been parsed yet.
    Before, at 0xabc0: <n>A&#32;&lt; B.</n>
    After,  at 0xabc0: <n>A < B.\0lt; B.</n>
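
A minimal sketch of why "never longer" makes in-place rewriting safe, using separate read and write pointers. `expand_inplace` is a hypothetical helper that handles only `&lt;`, `&gt;`, `&amp;`, and decimal character references below 128; it is not how pugixml implements the step.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: expands a few references inside `s` in place.
// The write pointer never overtakes the read pointer, because every
// supported reference is longer than the single character it expands to.
char* expand_inplace(char* s) {
    assert(s != nullptr);
    char* read = s;
    char* write = s;
    while (*read != '\0') {
        if (*read != '&') { *write++ = *read++; continue; }
        if (std::strncmp(read, "&lt;", 4) == 0)  { *write++ = '<'; read += 4; continue; }
        if (std::strncmp(read, "&gt;", 4) == 0)  { *write++ = '>'; read += 4; continue; }
        if (std::strncmp(read, "&amp;", 5) == 0) { *write++ = '&'; read += 5; continue; }
        if (read[1] == '#') {
            const char* p = read + 2;
            unsigned value = 0;
            while (*p >= '0' && *p <= '9' && value < 128)
                value = value * 10 + static_cast<unsigned>(*p++ - '0');
            if (*p == ';' && p != read + 2 && value > 0 && value < 128) {
                *write++ = static_cast<char>(value);
                read += (p - read) + 1;  // skip the whole "&#N;" sequence
                continue;
            }
        }
        *write++ = *read++;  // unrecognized reference: copy '&' verbatim
    }
    *write = '\0';
    return s;
}
```

Applied to the slide's example `A&#32;&lt; B.`, the buffer ends up holding `A < B.` followed by the terminator; the leftover tail of the original text stays behind it, as in the diagram.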
In-place parsing - Copy-on-Write
■ The input is read through memory-mapped file I/O.
■ Null termination and text transformation write into the buffer, so the mapping is made copy-on-write to keep the original file from being modified.
■ The file is mapped directly into the process address space, which avoids copying memory.
■ If the file is not cached yet, the kernel loads it, so I/O and parsing proceed in parallel.
■ Only modified pages are allocated in physical memory, which reduces memory consumption.
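
On POSIX systems a copy-on-write mapping can be requested with `mmap` and `MAP_PRIVATE`. The sketch below assumes a POSIX platform; `mapped_file`, `map_copy_on_write`, and `unmap` are names invented for this example, and pugixml's own file loading may work differently.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Assumed helper type for this sketch (POSIX only).
struct mapped_file {
    char* data = nullptr;
    std::size_t size = 0;
};

// Maps `path` privately: reads are served from the page cache, and the first
// write to a page gives this process its own copy of that page. The file on
// disk is never modified. Returns {nullptr, 0} on failure or for an empty file.
mapped_file map_copy_on_write(const char* path) {
    assert(path != nullptr);
    mapped_file result;
    int fd = open(path, O_RDONLY);
    if (fd < 0) return result;
    struct stat st;
    if (fstat(fd, &st) == 0 && st.st_size > 0) {
        void* p = mmap(nullptr, static_cast<std::size_t>(st.st_size),
                       PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
        if (p != MAP_FAILED) {
            result.data = static_cast<char*>(p);
            result.size = static_cast<std::size_t>(st.st_size);
        }
    }
    close(fd);  // the mapping stays valid after the descriptor is closed
    return result;
}

// Releases the mapping and resets the handle.
void unmap(mapped_file& f) {
    if (f.data) munmap(f.data, f.size);
    f = mapped_file{};
}
```

A parser can then write null terminators straight into `data`; only the pages it touches are duplicated, and the file itself keeps its original bytes.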
Optimizing character-wise operations
■ The key metric is the average number of processor cycles spent per character.
■ Character set membership test: deciding whether a character belongs to a given set of characters.
    enum chartype_t
    {
        ct_parse_pcdata = 1,    // \0, &, \r, <
        ct_parse_attr = 2,      // \0, &, \r, ', "
        ct_parse_attr_ws = 4,   // \0, &, \r, ', ", \n, tab
        ct_space = 8,           // \r, \n, space, tab
        ct_parse_cdata = 16,    // \0, ], >, \r
        ct_parse_comment = 32,  // \0, -, >, \r
        ct_symbol = 64,         // Any symbol > 127, a-z, A-Z, 0-9, _, :, -, .
        ct_start_symbol = 128   // Any symbol > 127, a-z, A-Z, _, :
    };

    // One byte per character; each bit records membership in one class above.
    static const unsigned char chartype_table[256] =
    {
        55,  0,   0,   0,   0,   0,   0,   0,   0,   12,  12,  0,   0,   63,  0,   0,   // 0-15
        0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   // 16-31
        8,   0,   6,   0,   0,   0,   7,   6,   0,   0,   0,   0,   0,   96,  64,  0,   // 32-47
        64,  64,  64,  64,  64,  64,  64,  64,  64,  64,  192, 0,   1,   0,   48,  0,   // 48-63
        0,   192, 192, 192, 192, 192, 192, 192, 192, 192, 192, 192, 192, 192, 192, 192, // 64-79
        192, 192, 192, 192, 192, 192, 192, 192, 192, 192, 192, 0,   0,   16,  0,   192, // 80-95
        0,   192, 192, 192, 192, 192, 192, 192, 192, 192, 192, 192, 192, 192, 192, 192, // 96-111
        192, 192, 192, 192, 192, 192, 192, 192, 192, 192, 192, 0,   0,   0,   0,   0,   // 112-127

        192, 192, 192, 192, 192, 192, 192, 192, 192, 192, 192, 192, 192, 192, 192, 192, // 128+
        // ... (rows elided on the slide)
        192, 192, 192, 192, 192, 192, 192, 192, 192, 192, 192, 192, 192, 192, 192, 192
    };

    // A single table load and a bit test decide membership in class ct.
    bool ischartype_utf8(char c, chartype_t ct)
    {
        return (ct & chartype_table[(unsigned char)c]) != 0;
    }
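
The table values follow mechanically from the class definitions in the enum comments. The sketch below rebuilds the table from those comments; `mark` and `build_chartype_table` are hypothetical helpers written for this illustration, and the slides do not say how the original table was produced.

```cpp
#include <cassert>
#include <cstring>

enum chartype_t {
    ct_parse_pcdata = 1, ct_parse_attr = 2, ct_parse_attr_ws = 4, ct_space = 8,
    ct_parse_cdata = 16, ct_parse_comment = 32, ct_symbol = 64, ct_start_symbol = 128
};

// Hypothetical generator step: ORs `flag` into the entry of every character
// in `chars`. `count` is passed explicitly so that '\0' can be listed.
static void mark(unsigned char* table, const char* chars, std::size_t count, unsigned flag) {
    for (std::size_t i = 0; i < count; ++i)
        table[static_cast<unsigned char>(chars[i])] |= static_cast<unsigned char>(flag);
}

// Fills `table` (256 entries) from the per-class character lists given in
// the enum comments.
void build_chartype_table(unsigned char* table) {
    assert(table != nullptr);
    std::memset(table, 0, 256);
    mark(table, "\0&\r<", 4, ct_parse_pcdata);
    mark(table, "\0&\r'\"", 5, ct_parse_attr);
    mark(table, "\0&\r'\"\n\t", 7, ct_parse_attr_ws);
    mark(table, "\r\n \t", 4, ct_space);
    mark(table, "\0]>\r", 4, ct_parse_cdata);
    mark(table, "\0->\r", 4, ct_parse_comment);
    for (int c = 0; c < 256; ++c) {
        bool letter = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
        bool start = c > 127 || letter || c == '_' || c == ':';
        bool digit = c >= '0' && c <= '9';
        if (start) table[c] |= ct_start_symbol;
        if (start || digit || c == '-' || c == '.') table[c] |= ct_symbol;
    }
}
```

For example, the entry for `\r` is 63 because it belongs to all six parsing classes (1 + 2 + 4 + 8 + 16 + 32), while `<` is 1 because only PCDATA scanning has to stop there.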
All characters in a given range
■ A function that decides whether a given character is a digit:
■ Two comparisons: bool isdigit(char ch) { return (ch >= '0' && ch <= '9'); }
■ One comparison: bool isdigit(char ch) { return (unsigned)(ch - '0') < 10; }
UTF-8 byte sequences
■ Code that tests whether four consecutive bytes form a UTF-8 sequence made up only of ASCII symbols:
■ (*(const uint32_t*)data & 0x80808080) == 0
Standard library is*() functions
■ Avoid isalpha() and its relatives in performance-critical code.
■ They are slowed down by the check for whether the current locale is "C".
Optimizing string transformations
■ Reading string values and transforming them takes a large share of the parsing time.
■ Input: A&#32;&lt; B.
■ Result: A < B.
PCDATA parsing function
■ Two bool template flags give four variants of the function.
■ Characters are scanned with the character set membership test.
    template <bool opt_eol, bool opt_escape> struct strconv_pcdata_impl {
        static char_t* parse(char_t* s) {
            gap g;
            while (true) {
                // Skip characters that need no special handling.
                while (!PUGI__IS_CHARTYPE(*s, ct_parse_pcdata)) ++s;
                if (*s == '<') { // PCDATA ends here
                    *g.flush(s) = 0;
                    return s + 1;
                } else if (opt_eol && *s == '\r') { // 0x0d or 0x0d 0x0a pair
                    *s++ = '\n'; // replace first one with 0x0a
                    if (*s == '\n') g.push(s, 1);
                } else if (opt_escape && *s == '&') {
                    s = strconv_escape(s, g);
                } else if (*s == 0) {
                    return s;
                } else {
                    ++s;
                }
            }
        }
    };
PCDATA gap management
■ Replacing &quot; with " leaves a gap of five characters.
■ Merging two gaps: the data between the existing gap and the new gap is moved toward the front.
■ Merging with memmove is somewhat faster than the read/write pointer approach.
Optimizing control flow
■ The parser has the shape of a recursive-descent parser, but the recursion is replaced with an iterative loop for performance.
■ A node cursor acts as the stack.
■ Stack space consumption is constant, regardless of the input document.
■ This improves robustness.
■ It also avoids potentially expensive function calls.
Branch ordering and code locality
■ Some parts run frequently (tag names, attributes) and some run rarely (DOCTYPE).
■ Likelihood: after a '<' character, the next thing is, from most to least likely, a tag name, '/', '!', or '?'.
■ Code fragments are rearranged according to these probabilities.
■ The amount of inlined code is limited.
โ– ์กฐ๊ฑด ๋ถ„๊ธฐ๋“ค์„ ํ™•๋ฅ ์ด ๋†’์€ ๊ฑฐ์—์„œ ๋‚ฎ์€ ๊ฒƒ ์ˆœ์„œ๋กœ ์žฌ๋ฐฐ์น˜.!
โ– ํ‰๊ท ์ ์ธ ์กฐ๊ฑด ํŒ์ • ๋ฐ ๋ถ„๊ธฐ ์ˆ˜ํ–‰ ํšŸ์ˆ˜๊ฐ€ ์ตœ์†Œํ™”.
! if (data[0] == '<')!
! {!
! if (data[1] == '!') { ... }!
! else if (data[1] == '/') { ... }!
! else if (data[1] == '?') { ... }!
! else { /* start-tag or unrecognized tag */ }!
! }!
!
! if (data[0] == '<')!
! {!
! if (PUGI__IS_CHARTYPE(data[1], ct_start_symbol)) { /* start-tag */ }!
! else if (data[1] == '/') { ... }!
! else if (data[1] == '!') { ... }!
! else if (data[1] == '?') { ... }!
! else { /* unrecognized tag */ }!
! }!
Ensuring memory safety
■ The goal is to prevent buffer overruns.
■ One approach compares the current read position with the end of the buffer.
■ This needs one more register.
■ Every function call then needs pointers for both the current position and the end position.
■ The other approach relies on a null terminator.
■ This is inconvenient for callers who pass an input buffer together with its size.
DOM data structure
■ With a linked-list based layout, modifying a node is O(1).
■ With linked lists, all allocations have a fixed size, and a fast allocator for fixed-size allocations is easier to design than one for arbitrary sizes.
■ Memory saving: last_child is removed and prev_sibling is replaced with prev_sibling_cyclic; access stays O(1).
Before:
    struct Node {
        Node* first_child;
        Node* last_child;
        Node* prev_sibling;
        Node* next_sibling;
    };
After:
    struct Node {
        Node* first_child;
        Node* prev_sibling_cyclic;
        Node* next_sibling;
    };
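    // The first child's prev_sibling_cyclic points at the last child.
    Node* last_child(Node* node) {
        return (node->first_child) ?
            node->first_child->prev_sibling_cyclic : NULL;
    }

    // Only the last child has a null next_sibling, so if the "previous" node
    // has no next sibling, node is the first child and has no predecessor.
    Node* prev_sibling(Node* node) {
        return (node->prev_sibling_cyclic->next_sibling) ?
            node->prev_sibling_cyclic : NULL;
    }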
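
To see that the cyclic link really keeps both operations O(1), here is a sketch of how a child could be appended under this layout. `append_child` is a hypothetical helper written for this example; the two accessors repeat the slide's logic so the block is self-contained.

```cpp
#include <cassert>

struct Node {
    Node* first_child = nullptr;
    Node* prev_sibling_cyclic = nullptr;  // the first child points to the last child
    Node* next_sibling = nullptr;         // null for the last child
};

// Hypothetical helper: appends `child` to `parent` in O(1) without a
// last_child field, by reaching the last child through the first one.
void append_child(Node* parent, Node* child) {
    assert(parent && child);
    child->next_sibling = nullptr;
    if (Node* first = parent->first_child) {
        Node* last = first->prev_sibling_cyclic;
        last->next_sibling = child;
        child->prev_sibling_cyclic = last;
        first->prev_sibling_cyclic = child;  // the cycle closes on the new last child
    } else {
        parent->first_child = child;
        child->prev_sibling_cyclic = child;  // a single child is its own "previous"
    }
}

Node* last_child(Node* node) {
    return node->first_child ? node->first_child->prev_sibling_cyclic : nullptr;
}

Node* prev_sibling(Node* node) {
    return node->prev_sibling_cyclic->next_sibling ? node->prev_sibling_cyclic : nullptr;
}
```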
Stack-based memory allocation
■ Handles variable-size string allocations.
■ Preserves allocation locality.
■ Memory is released in a way that makes DOM destruction fast.
Stack allocator
    const size_t allocator_page_size = 32768;

    struct allocator_page {
        allocator_page* next_page;
        size_t offset;
        char data[allocator_page_size];
    };

    struct allocator_state {
        allocator_page* current;
    };

    // Allocates raw storage for one page; oversized requests get extra room.
    void* allocate_new_page_data(size_t size) {
        size_t extra_size = (size > allocator_page_size) ?
            size - allocator_page_size : 0;
        return malloc(sizeof(allocator_page) + extra_size);
    }

    // Slow path: the request does not fit in the current page.
    void* allocate_oob(allocator_state* state, size_t size) {
        allocator_page* page =
            (allocator_page*)allocate_new_page_data(size);
        // add page to page list
        page->next_page = state->current;
        state->current = page;
        // user data is located at the beginning of the page
        page->offset = size;
        return page->data;
    }

    // Fast path: bump the offset inside the current page.
    void* allocate(allocator_state* state, size_t size) {
        if (state->current->offset + size <= allocator_page_size) {
            void* result = state->current->data + state->current->offset;
            state->current->offset += size;
            return result;
        }
        return allocate_oob(state, size);
    }
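
The listing above does not show how the first page is created or how the pages are released. The sketch below fills in those two steps under stated assumptions: `allocator_init` and `allocator_destroy` are hypothetical helpers, and `allocate` is reduced to requests that fit in one page.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

const std::size_t allocator_page_size = 32768;

struct allocator_page {
    allocator_page* next_page;
    std::size_t offset;
    char data[allocator_page_size];
};

struct allocator_state {
    allocator_page* current;
};

// Hypothetical setup helper: starts the allocator with one empty page.
bool allocator_init(allocator_state* state) {
    assert(state != nullptr);
    allocator_page* page = static_cast<allocator_page*>(std::malloc(sizeof(allocator_page)));
    if (!page) return false;
    page->next_page = nullptr;
    page->offset = 0;
    state->current = page;
    return true;
}

// Bump allocation as in the listing above, reduced to requests that fit in
// one page; a full page is replaced by a fresh one at the head of the list.
void* allocate(allocator_state* state, std::size_t size) {
    assert(state != nullptr && size <= allocator_page_size);
    if (state->current->offset + size > allocator_page_size) {
        allocator_page* page = static_cast<allocator_page*>(std::malloc(sizeof(allocator_page)));
        if (!page) return nullptr;
        page->next_page = state->current;
        page->offset = 0;
        state->current = page;
    }
    void* result = state->current->data + state->current->offset;
    state->current->offset += size;
    return result;
}

// Hypothetical teardown helper: releases every page in one pass. Individual
// allocations are never visited, which keeps destroying a whole document
// cheap. Returns the number of pages released.
std::size_t allocator_destroy(allocator_state* state) {
    assert(state != nullptr);
    std::size_t released = 0;
    for (allocator_page* page = state->current; page != nullptr; ) {
        allocator_page* next = page->next_page;
        std::free(page);
        page = next;
        ++released;
    }
    state->current = nullptr;
    return released;
}
```

Consecutive allocations land next to each other in the same page, which is where the locality mentioned on the previous slide comes from.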
Supporting deallocation in the stack-based allocator
■ Reference counting is introduced so that memory can be released and reused.
■ Every page is aligned to a 32-byte boundary, so the low five bits of every page pointer are always 0.
■ Those five bits are used to store XML metadata.
■ The position of each allocated element is stored as an offset relative to the start of its page.
■ Address of the page:
    (allocator_page*)((char*)(object) - object->offset - offsetof(allocator_page, data))
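
Both tricks on this slide can be sketched in a few lines. The names `pack`, `unpack_page`, `unpack_meta`, and `page_of` are invented for this example, the page is shrunk to 256 bytes, and the offset is passed as a parameter instead of being read from the object as in the slide's formula.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct allocator_page {
    allocator_page* next_page;
    std::size_t offset;
    char data[256];  // small page, for illustration only
};

// Packs a 32-byte-aligned page pointer and 5 bits of metadata into one word.
inline std::uintptr_t pack(const allocator_page* page, unsigned meta) {
    std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(page);
    assert((bits & 31u) == 0 && meta < 32u);
    return bits | meta;
}

// Clears the low five bits to get the page pointer back.
inline allocator_page* unpack_page(std::uintptr_t word) {
    return reinterpret_cast<allocator_page*>(word & ~static_cast<std::uintptr_t>(31));
}

// Extracts the five metadata bits.
inline unsigned unpack_meta(std::uintptr_t word) {
    return static_cast<unsigned>(word & 31u);
}

// Recovers the owning page from the address of an allocation inside
// page->data, given that allocation's offset from the start of `data`.
inline allocator_page* page_of(void* object, std::size_t offset_in_data) {
    return reinterpret_cast<allocator_page*>(
        static_cast<char*>(object) - offset_in_data - offsetof(allocator_page, data));
}
```

The sketch relies on the page being 32-byte aligned, for instance through `alignas(32)` or an aligned allocation; plain `malloc` does not guarantee that alignment.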
