Perl Memory Use - LPW2013

Slides for my talk at the London Perl Workshop in Nov 2013, featuring the Devel::SizeMe perl module.

See also the screencast at https://archive.org/details/Perl-Memory-Profiling-LPW2013

Usage Rights: CC Attribution-NonCommercial-ShareAlike License

Perl Memory Use - LPW2013 Presentation Transcript

  • 1. Perl Memory Use Tim Bunce @ London Perl Workshop 2013
  • 2. Ouch!
        $ perl some_script.pl
        Out of memory!
        $
        $ perl some_script.pl
        Killed.
        $
        $ perl some_script.pl
        $
        Someone shouts: "Hey! My process has been killed!"
        $ perl some_script.pl
        [...later...] "Why is it taking so long?"
  • 3. Process Memory
  • 4. $ perl -e 'system("cat /proc/$$/stat")' # $$ = pid
        4752 (perl) S 4686 4752 4686 34816 4752 4202496 536 0 0 0 0 0 0 0 20 0 1 0 62673440 123121664 440 18446744073709551615 4194304 4198212 140735314078128 140735314077056 140645336670206 0 0 134 0 18446744071579305831 0 0 17 10 0 0 0 0 0 0 0 0 0 0 4752 111 111 111

        $ perl -e 'system("cat /proc/$$/statm")'
        30059 441 346 1 0 160 0

        $ perl -e 'system("ps -p $$ -o vsz,rsz,sz,size")'
           VSZ   RSZ    SZ    SZ
        120236  1764 30059   640

        $ perl -e 'system("top -b -n1 -p $$")'
        ...
          PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
        13063 tim   20   0  117m 1764 1384 S  0.0  0.1   0:00.00 perl

        $ perl -e 'system("cat /proc/$$/status")'
        ...
        VmPeak:   120236 kB
        VmSize:   120236 kB  <- total (code, libs, stack, heap etc.)
        VmHWM:      1760 kB
        VmRSS:      1760 kB  <- how much of the total is resident in physical memory
        VmData:      548 kB  <- data (heap)
        VmStk:        92 kB  <- stack
        VmExe:         4 kB  <- code
        VmLib:      4220 kB  <- libs, including libperl.so
        VmPTE:        84 kB
        VmPTD:        28 kB
        VmSwap:        0 kB
        ...
        Further info on unix.stackexchange.com
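The `Vm*` figures in `/proc/$$/status` can also be read from inside a running script. The following is a minimal sketch, assuming a Linux-style `/proc`; the helper names `parse_vm_status` and `current_vm_status` are illustrative and not from the talk.

```perl
use strict;
use warnings;

# Parse the "Vm*" lines of a /proc/<pid>/status dump into a hash of kB values.
sub parse_vm_status {
    my ($text) = @_;
    my %vm;
    while ($text =~ /^(Vm\w+):\s+(\d+)\s+kB/mg) {
        $vm{$1} = $2;
    }
    return \%vm;
}

# Read the current process's status file; returns undef where /proc is absent.
sub current_vm_status {
    open my $fh, '<', '/proc/self/status' or return;
    local $/;
    my $text = <$fh> // '';
    return parse_vm_status($text);
}

if (my $vm = current_vm_status()) {
    printf "%-8s %8d kB\n", $_, $vm->{$_}
        for grep { exists $vm->{$_} } qw(VmSize VmRSS VmData VmStk);
}
```

Keeping the parser separate from the file read makes it easy to feed it saved output from another process.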
  • 5. [Diagram: process memory layout, not to scale!]
        C Program Code: int main(...) { ... }
        Read-only Data: eg “String constants”
        Read-write Data: un/initialized variables
        Heap
        Shared Lib Code, Shared Lib R/O Data, Shared Lib R/W Data (repeated for each lib)
        C Stack (not the perl stack)
        System
  • 6. $ perl -e 'system("cat /proc/$$/maps")'
        address                    perms ... pathname
        00400000-00401000          r-xp  ... /.../perl-5.NN.N/bin/perl
        00601000-00602000          rw-p  ... /.../perl-5.NN.N/bin/perl
        0087f000-008c1000          rw-p  ... [heap]
        7f858cba1000-7f8592a32000  r--p  ... /usr/lib/locale/locale-archive-rpm
        7f8592c94000-7f8592e1a000  r-xp  ... /lib64/libc-2.12.so
        7f8592e1a000-7f859301a000  ---p  ... /lib64/libc-2.12.so
        7f859301a000-7f859301e000  r--p  ... /lib64/libc-2.12.so
        7f859301e000-7f859301f000  rw-p  ... /lib64/libc-2.12.so
        7f859301f000-7f8593024000  rw-p  ...
        ...other libs...
        7f8593d1b000-7f8593e7c000  r-xp  ... /.../lib/5.NN.N/x86_64-linux/CORE/libperl.so
        7f8593e7c000-7f859407c000  ---p  ... /.../lib/5.NN.N/x86_64-linux/CORE/libperl.so
        7f859407c000-7f8594085000  rw-p  ... /.../lib/5.NN.N/x86_64-linux/CORE/libperl.so
        7f85942a6000-7f85942a7000  rw-p  ...
        7fff61284000-7fff6129a000  rw-p  ... [stack]
        7fff613fe000-7fff61400000  r-xp  ... [vdso]
        ffffffffff600000-ffffffffff601000 r-xp ... [vsyscall]
  • 7. $ perl -e 'system("cat /proc/$$/smaps")' # note ‘smaps’ not ‘maps’
        address ... perms ... pathname
        7fb00fbc1000-7fb00fd22000 r-xp ... /.../5.10.1/x86_64-linux/CORE/libperl.so
        Size:             1412 kB  <- size of executable code in libperl.so
        Rss:               720 kB  <- amount that's currently in physical memory
        Pss:               364 kB
        Shared_Clean:      712 kB
        Shared_Dirty:        0 kB
        Private_Clean:       8 kB
        Private_Dirty:       0 kB
        Referenced:        720 kB
        Anonymous:           0 kB
        AnonHugePages:       0 kB
        Swap:                0 kB
        KernelPageSize:      4 kB
        MMUPageSize:         4 kB
        ... repeated for every segment ...
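Because `smaps` repeats the same fields for every segment, per-process totals come from summing them. Below is a hedged sketch of that summation; `smaps_totals` and its default field list are assumptions made for illustration, not part of the talk.

```perl
use strict;
use warnings;

# Sum selected per-segment fields (in kB) across a /proc/<pid>/smaps dump.
sub smaps_totals {
    my ($text, @fields) = @_;
    @fields = qw(Size Rss Pss Private_Dirty) unless @fields;
    my %total = map { $_ => 0 } @fields;
    for my $line (split /\n/, $text) {
        next unless $line =~ /^(\w+):\s+(\d+)\s+kB/;
        $total{$1} += $2 if exists $total{$1};
    }
    return \%total;
}

# Only runs where /proc/self/smaps exists and is readable.
if (open my $fh, '<', '/proc/self/smaps') {
    local $/;
    my $totals = smaps_totals(scalar(<$fh>) // '');
    printf "%-14s %8d kB\n", $_, $totals->{$_} for sort keys %$totals;
}
```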
  • 8. Memory Pages
        ✦ Process view:
          ✦ Large continuous regions of memory. Simple.
        ✦ Operating System view:
          ✦ Memory is divided into pages
          ✦ Pages are loaded to physical memory on demand
          ✦ Mapping can change without the process knowing
  • 9. Memory is divided into pages. Page size is typically 4KB. RSS “Resident Set Size” is how much process memory is currently in physical memory.
        [Diagram: the process memory layout (C Program Code, Read-only Data, Read-write Data, Heap, Shared Lib Code, Shared Lib R/O Data, Shared Lib R/W Data, C Stack, System) divided into pages, each marked as either ‘resident’ in physical memory or not resident]
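The `statm` figures on slide 4 are counted in pages, so the page size ties them to the kB figures reported by `ps` and `top`. A small worked sketch follows; `statm_to_kb` is a hypothetical helper, and the 4 KB page size is the typical value named on the slide rather than a guarantee.

```perl
use strict;
use warnings;
use POSIX qw(sysconf _SC_PAGESIZE);

# Convert the first three /proc/<pid>/statm fields (counted in pages) to kB.
sub statm_to_kb {
    my ($statm_line, $page_bytes) = @_;
    my ($size, $resident, $shared) = split ' ', $statm_line;
    my $kb_per_page = $page_bytes / 1024;
    return {
        size_kb     => $size     * $kb_per_page,
        resident_kb => $resident * $kb_per_page,
        shared_kb   => $shared   * $kb_per_page,
    };
}

# Figures from the statm output on slide 4, assuming 4 KB pages.
my $kb = statm_to_kb('30059 441 346 1 0 160 0', 4096);
print "size=$kb->{size_kb} kB resident=$kb->{resident_kb} kB\n";
# prints: size=120236 kB resident=1764 kB

# The live page size, where the platform reports one.
my $page_bytes = sysconf(_SC_PAGESIZE);
print "page size: $page_bytes bytes\n" if defined $page_bytes;
```

The results agree with the `ps` output on the same slide: 30059 pages is the 120236 kB shown as VSZ, and 441 pages is the 1764 kB shown as RSZ.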
  • 10. Key Point
        ✦ Don’t use Resident Set Size (RSS)
          ✦ Unless you really want to know what’s currently resident.
          ✦ It can shrink even while the process size grows.
        ✦ Heap size or Total memory size is a good indicator.
  • 11. Malloc and The Heap
  • 12. “Malloc and The Heap”
  • 13. Heap ← Your perl stuff goes here
  • 14. malloc manages memory allocation. malloc() requests big chunks of memory from the operating system as needed. Almost never returns it! Perl makes lots of malloc and free requests. Freed fragments of various sizes accumulate. [Diagram: perl data inside the Heap]
  • 15. Your Data
  • 16. Perl Data Anatomy [Illustrations of an Integer (IV), a String (PV) and a Number with a string, each made up of a Head, a Body and Data] Illustrations from illguts
  • 17. Array (AV) Hash (HV)
  • 18. Glob (GV) Symbol Table (Stash) Sub (CV) lots of tiny chunks!
  • 19. Devel::Peek
        • Gives you a textual view of data
        $ perl -MDevel::Peek -e '%a = (42 => "Hello World!"); Dump(%a)'
        SV = IV(0x1332fd0) at 0x1332fe0
          REFCNT = 1
          FLAGS = (TEMP,ROK)
          RV = 0x1346730
          SV = PVHV(0x1339090) at 0x1346730
            REFCNT = 2
            FLAGS = (SHAREKEYS)
            ARRAY = 0x1378750 (0:7, 1:1)
            KEYS = 1
            FILL = 1
            MAX = 7
            Elt "42" HASH = 0x73caace8
            SV = PV(0x1331090) at 0x1332de8
              REFCNT = 1
              FLAGS = (POK,pPOK)
              PV = 0x133f960 "Hello World!"\0
              CUR = 12  <= length in use
              LEN = 16  <= amount allocated
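The CUR and LEN fields in the dump can also be read programmatically. This sketch uses the core `B` module rather than Devel::Peek; `cur_and_len` is an illustrative helper name, and the exact LEN reported depends on the perl build and its allocator.

```perl
use strict;
use warnings;
use B ();

# Return (CUR, LEN) for a string scalar: bytes in use vs bytes allocated.
sub cur_and_len {
    my ($ref) = @_;
    my $sv = B::svref_2object($ref);
    return ($sv->CUR, $sv->LEN);
}

my $str = "Hello World!";
my ($cur, $len) = cur_and_len(\$str);
print "CUR=$cur LEN=$len spare=", $len - $cur, "\n";
```

The gap between LEN and CUR is one small reason a process uses more memory than the sum of its string lengths.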
  • 20. Devel::Size
        • Gives you a measure of the size of a data structure
        $ perl -MDevel::Size=total_size -le 'print total_size( 0 )'
        24
        $ perl -MDevel::Size=total_size -le 'print total_size( [] )'
        64
        $ perl -MDevel::Size=total_size -le 'print total_size( {} )'
        120
        $ perl -MDevel::Size=total_size -le 'print total_size( [ 1..100 ] )'
        3264
        • Created by Dan Sugalski, now maintained by Nicholas Clark
        • Is very fast, and accurate for most simple data types.
        • Has limitations and bugs, but is the best tool we have.
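The one-liners above extend naturally to comparing several structures in one script. The sketch below assumes Devel::Size may not be installed, so it loads the module optionally; `size_report` is a hypothetical helper, and the byte counts it prints will differ between perl versions and platforms.

```perl
use strict;
use warnings;

# Devel::Size is a CPAN module, so load it optionally.
my $have_devel_size = eval { require Devel::Size; 1 };

# Report total_size() for a set of labelled structures, or an empty list
# when Devel::Size is not installed.
sub size_report {
    my (%structures) = @_;
    return () unless $have_devel_size;
    return map { [ $_, Devel::Size::total_size($structures{$_}) ] }
           sort keys %structures;
}

my @report = size_report(
    'empty array'      => [],
    'array of 100 int' => [ 1 .. 100 ],
    'empty hash'       => {},
    'hash of 100 keys' => { map { $_ => 1 } 1 .. 100 },
);
printf "%-18s %6d bytes\n", @$_ for @report;
print "Devel::Size not installed\n" unless $have_devel_size;
```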
  • 21. Arenas
        Heads and Bodies are allocated from ‘arenas’ (slabs) managed by perl. One for SV heads and one for each size of SV body. More efficient than malloc in space and speed. Introspect arenas with Devel::Arena and Devel::Gladiator.
        $ perl -MDevel::Gladiator=arena_table -e 'warn arena_table()'
        ARENA COUNTS:
        1063 SCALAR
         199 GLOB
         120 ARRAY
          95 CODE
          66 HASH
        ...
  • 22. Key Notes
        ✦ All variable length data storage comes from malloc
          ✦ malloc has overheads, bucket and fragmentation issues
        ✦ Heads and Bodies are allocated from ‘arenas’ managed by perl
          ✦ Arenas have less overhead but are never freed
        ✦ Memory usage will always be higher than the sum of the sizes.
  • 23. Memory Profiling
  • 24. Memory Profiling?
        ✦ Track memory size over time?
        ✦ See where memory is allocated and freed?
        ✦ Experiments with Devel::NYTProf
          ✦ Turned out to not be very useful
        ✦ Need to know what is ‘holding’ memory.
  • 25. Space in Hiding
        ✦ Perl tends to consume extra memory to save time
        ✦ This can lead to surprises, for example:

        sub foo {
            my $var = "X" x 10_000_000;
        }
        foo(); # ~20MB still used after return!

        sub bar {
            my $var = "X" x 10_000_000;
            bar($_[0]-1) if $_[0]; # recurse
        }
        bar(50); # ~1GB still used after return!
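One way to see the retained buffer is to inspect the sub's pad after it has returned. This is a rough sketch using the core `B` module; it assumes the `B::PADLIST` interface of perl 5.18 and later, `pad_buffer_bytes` is a hypothetical helper, and the size is passed in as an argument, so the totals will not match the ~20MB figure on the slide.

```perl
use strict;
use warnings;
use B ();

# Same shape as foo() on the slide, with the size passed in.
sub foo {
    my ($n) = @_;
    my $var = "X" x $n;
    return;
}

# Sum the string buffers (LEN) held by the scalars in a sub's first pad.
# Assumption: B::PADLIST with ARRAYelt(), as in perl 5.18 and later.
sub pad_buffer_bytes {
    my ($code) = @_;
    my $pad = B::svref_2object($code)->PADLIST->ARRAYelt(1);
    my $bytes = 0;
    for my $sv ($pad->ARRAY) {
        # Only plain string-capable scalars; skip arrays, specials, etc.
        next unless ref($sv) =~ /\AB::PV(?:IV|NV|MG)?\z/;
        $bytes += $sv->LEN;
    }
    return $bytes;
}

my $before = pad_buffer_bytes(\&foo);
foo(1_000_000);
my $after = pad_buffer_bytes(\&foo);
print "pad buffers before: $before bytes, after: $after bytes\n";
```

The "after" figure stays large even though `$var` went out of scope, because perl keeps the buffer attached to the pad for reuse on the next call.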
  • 26. X-Ray Vision! ✦ Want to see inside the black box ✦ Want to know “where memory is being held” ✦ A snapshot “crawl and dump” approach ✦ Separate capture from analysis
  • 27. My Plan (circa 2012) ✦ Extend Devel::Size ✦ Add a C-level callback hook ✦ Add some kind of "data path name" mechanism ✦ Add a function to return the size of everything ✦ Stream the data to disk ✦ Write tools to manipulate, summarize & query the data ✦ Write tools to visualize the data ✦ Write tools to compare sets of data
  • 28. Devel::SizeMe ✦ Fork of Devel::Size ✦ Still very experimental ✦ Lots of hacks and rough edges ✦ Some deep refactoring needed ✦ Still exploring what’s possible ✦ ... but it seems useful now
  • 29. Devel::SizeMe Outputs ✦ Text - handy for testing and simple structures ✦ Graphviz - useful visualization for up to ~1000 nodes ✦ Treemap - useful for simple top-down view (“blame”) ✦ Gephi - full network view (structure, relationships) ✦ SQLite db ✦ Very little analysis implemented yet ✦ Ref-loops are isolated from “owners”
  • 30. [Diagram: Devel::SizeMe feeds sizeme_store, which writes Text, Graphviz (dot), GEXF (for Gephi) and an SQLite db; sizeme_graph turns the SQLite db into a Treemap in browser. One box is labelled ???]
  • 31. Demonstration
  • 32. See https://archive.org/details/Perl-Memory-Profiling-LPW2013
  • 33. Devel::SizeMe Summary ✦ Focussed on memory use ✦ Walks trees of pointers in perl internals ✦ Can dump individual data structures ✦ Stream-based - scales to any size of application ✦ Multiple output formats ✦ Very minimal and informal data model
  • 34. Current Limitations ✦ Very minimal and informal data model ✦ Ref loops get separated out ✦ Accumulating sizes up tree happens too soon ✦ Can’t edit the tree without invalidating sizes ✦ Needs a multi-phase processing pipeline ✦ Needs a more task-oriented user interface
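To illustrate the point about accumulating sizes too soon, here is a small sketch of the alternative: store only each node's own size and roll totals up at query time. The node records, field names and `subtree_totals` are invented for illustration and are not Devel::SizeMe's actual data model; the sketch also assumes a strict tree with no reference loops.

```perl
use strict;
use warnings;

# Hypothetical node records: each stores only its own size plus a parent id.
my @nodes = (
    { id => 1, parent => undef, name => 'main::',  self_size => 100  },
    { id => 2, parent => 1,     name => '%cache',  self_size => 120  },
    { id => 3, parent => 2,     name => 'key "a"', self_size => 56   },
    { id => 4, parent => 2,     name => 'key "b"', self_size => 4096 },
    { id => 5, parent => 1,     name => '@queue',  self_size => 64   },
);

# Accumulate sizes up the tree at query time, leaving stored sizes untouched.
sub subtree_totals {
    my (@records) = @_;
    my %total  = map { $_->{id} => $_->{self_size} } @records;
    my %parent = map { $_->{id} => $_->{parent} } @records;
    for my $node (@records) {
        my $ancestor = $parent{ $node->{id} };
        while (defined $ancestor) {
            $total{$ancestor} += $node->{self_size};
            $ancestor = $parent{$ancestor};
        }
    }
    return \%total;
}

my $totals = subtree_totals(@nodes);
printf "%-10s self=%5d total=%5d\n",
    $_->{name}, $_->{self_size}, $totals->{ $_->{id} }
    for @nodes;
```

Because the stored records are never modified, nodes can be moved or removed and the totals simply recomputed.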
  • 35. Recommendations ✦ Store the data in some kind of database ✦ Perform transformations on the database data ✦ Generate UI from the database - scalability ✦ Express queries as db queries - flexibility ✦ What kind of database? Relational or Graph?
  • 36. Possible Futures ✦ Feed Devel::MAT data into SQLite ✦ Feed SQLite data into Neo4j ✦ Develop useful Cypher query fragments ✦ Develop graph simplifications as plugins ✦ Develop visualizations
  • 37. Questions? Tim.Bunce@pobox.com http://blog.timbunce.org @timbunce