Hyojeong Lee
Distributed Computing System Laboratory
Department of Computer Science and Engineering
Seoul National University, Korea
Progress Report
● DONE
● Solve remaining problems
● Evaluate w/ benchmarks
An Efficient Garbage Collection in Java Virtual Machine via Swap I/O Optimization (submitted to APSys '19)
DONE
Solve remaining problems
Marking (multi-threaded) → Summary (single-threaded)
- mark_live_object (to bitmap)
- check_skippable_region
- fill_and_mark_holes_with_dummy
- map_src_dest (skip swapped region's mapping)
Compact (multi-threaded)
- update_object_ptr (pointers of a skippable region are updated to themselves)
  - problem: object is unmarked
- dense_prefix_task (update dense prefix)
  - problem: dense_prefix is unavailable
Post-compact (multi-threaded)
- verify_ptrs (check that all object pointers are available)
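The map_src_dest step above (skip the swapped region's mapping, so its pointers resolve to themselves) can be illustrated with a minimal sketch. This is not the HotSpot implementation: the region dictionaries, the `swapped` flag, the fixed `capacity`, and the return shape are all assumptions made for illustration only.

```python
def map_src_dest(regions, capacity):
    """Hypothetical summary-phase sketch.

    regions:  list of dicts with 'live' (live words, <= capacity)
              and 'swapped' (True if the region is skipped).
    capacity: words per region (assumed equal for all regions).
    Returns {src_region_index: (dest_region_index, dest_offset)}.
    """
    # Only regions that are not swapped out may receive compacted data.
    dest_regions = [i for i, r in enumerate(regions) if not r["swapped"]]
    mapping = {}
    cursor = 0  # word offset within the concatenated non-swapped regions
    for i, r in enumerate(regions):
        if r["swapped"]:
            # Skippable region: it stays in place, so it maps to itself.
            mapping[i] = (i, 0)
            continue
        mapping[i] = (dest_regions[cursor // capacity], cursor % capacity)
        cursor += r["live"]
    return mapping


example = [
    {"live": 4, "swapped": False},
    {"live": 3, "swapped": True},   # swapped out: skipped
    {"live": 2, "swapped": False},
    {"live": 6, "swapped": False},
]
print(map_src_dest(example, capacity=8))
```

In this sketch the swapped region is never read or written during the mapping, which is the property the skip is meant to provide; the holes it leaves would still need dummy fillers, as in fill_and_mark_holes_with_dummy.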
SPECjvm2008
● derby
● scimark.fft.large
● serial
● crypto.aes
● xml.validation
● sunflow
● mpegaudio
● compress
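(DBG) Unmarked-object problem
→ fixed find_next_obj()
(DBG) Concurrency problem with the dense prefix (dp)
→ excluded it from the scan targets
(OPT) Run mark_region after summarize_quick() and after dp is determined,
which narrows the scan range to dp through end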
Evaluation - overview
● Compared to related work, our approach
  ● focuses on the case where swapping occurs
  ● handles swapped data using information obtained directly from the OS
  ● requires no changes to application code
● Target: industry-standard benchmarks
  ● SPECjvm2008
  ● DaCapo
SPECjvm2008
● derby
● scimark.fft.large
● serial
● crypto.aes
● xml.validation
● sunflow
● mpegaudio
● compress
● scimark.lu.large
● scimark.sparse.large
DaCapo
● eclipse
● jython
● lusearch
● pmd
● xalan
● avrora
● h2
● lusearch-fix
● tradebeans
● tradesoap
Evaluation - FGC rate
● Improvement ranges from none to 68.7%, with an average of 25.5%.
Evaluation - throughput (SPECjvm2008)
● Derby: amount of swap I/O reduced by 47.8% and 56.5%
  (69.1 GB / 59.5 GB → 36.1 GB / 25.9 GB)
● Improvement of 88.3% at best and 15.9% on average.
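As a quick check, the two Derby reduction percentages follow directly from the before/after swap I/O volumes on this slide. The helper name `percent_reduction` is introduced here only for illustration.

```python
def percent_reduction(before_gb, after_gb):
    """Relative reduction from before_gb to after_gb, in percent."""
    return (before_gb - after_gb) / before_gb * 100.0


# (before, after) swap I/O volumes in GB, as reported for Derby.
derby_pairs = [(69.1, 36.1), (59.5, 25.9)]
reductions = [round(percent_reduction(b, a), 1) for b, a in derby_pairs]
print(reductions)  # [47.8, 56.5]
```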
Evaluation - execution time (DaCapo)
● Improvement of 26.9% at best and 12.7% on average.
● Our proposed policy
  ● works well for xml.validation and eclipse,
    ● because they keep many long-lived objects in the old generation.
  ● does not help sunflow, compress, lusearch, pmd, and xalan,
    ● because they create short-lived objects, so there are few FGCs (YGCs instead).
  ● improves the FGC rate but not throughput for some benchmarks (e.g. serial, scimark.lu.large),
    ● due to fragmentation of the heap.
● Future work
● Kernel API optimization
● Fine-grained compaction (region → object)
TODO
Regions A, B, C, each with DC_COUNT = 1
- (Existing) B copies its objects to A → B decrements DC_COUNT (atomically)
→ then C can copy
- Doing this at object granularity could also solve the fragmentation problem noted earlier
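The existing region-level ordering (B copies to A, B's counter drops, then C may copy) can be sketched as below. This is one possible reading of the slide, not the JVM code: the `Region` class, the lock standing in for an atomic decrement, and the rule "a region may be overwritten only when its DC_COUNT is 0" are assumptions. The sketch models only the B → A, C → B chain and does not cover how A's own counter is released.

```python
import threading


class Region:
    """Hypothetical heap region with a destination counter."""

    def __init__(self, name, live_objects):
        self.name = name
        self.objects = list(live_objects)
        self.dc_count = 1  # as on the slide: DC_COUNT = 1
        self._lock = threading.Lock()

    def decrement_dc_count(self):
        # A lock stands in for the atomic decrement mentioned on the slide.
        with self._lock:
            self.dc_count -= 1
            return self.dc_count

    def is_free_as_destination(self):
        # Assumption: a region may be overwritten once its count reaches 0.
        with self._lock:
            return self.dc_count == 0


def copy_region(src, dest, log):
    """Move src's live objects into dest, then release src."""
    dest.objects.extend(src.objects)
    src.objects.clear()
    src.decrement_dc_count()
    log.append(f"{src.name}->{dest.name}")


def compact_chain():
    a = Region("A", ["a1"])
    b = Region("B", ["b1", "b2"])
    c = Region("C", ["c1"])
    log = []
    # C must not overwrite B while B still holds uncopied objects.
    c_allowed_before = b.is_free_as_destination()
    copy_region(b, a, log)
    c_allowed_after = b.is_free_as_destination()
    if c_allowed_after:
        copy_region(c, b, log)
    return a, b, c, log, c_allowed_before, c_allowed_after
```

At region granularity, C waits for all of B to be evacuated; tracking the same dependency per object, as proposed above, would let copies start earlier and place objects more tightly.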
