2. When Pharo images get large
• A 7 gigabyte Pharo image is “the largest comfortable
Pharo image” [1].
• The largest GemStone production image is 1.5
terabytes.
[1] https://clementbera.wordpress.com/2017/03/12/tuning-the-pharo-garbage-collector/
4. Object Table
• every object has an object id
• Object Table maps an object id to a data page
• data pages are the unit of disk i/o
• 1-2000 objects can fit on a data page
• an object larger than a page is broken up into
page-sized chunks
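The mapping described above can be sketched as a toy model. This is an illustration only, not GemStone's on-disk format: the class name `ObjectTable`, the 8-byte `PAGE_SIZE`, and the `(page id, offset, length)` location tuples are all assumptions made for the example.

```python
PAGE_SIZE = 8  # hypothetical page size in bytes, kept tiny for illustration


class ObjectTable:
    """Toy object table: object id -> locations of its bytes on data pages."""

    def __init__(self):
        self.pages = [bytearray()]  # pages[i] stands in for data page i on disk
        self.table = {}             # object id -> [(page id, offset, length), ...]

    def store(self, oid, payload):
        # Split the payload into page-sized chunks; a small object yields one chunk.
        chunks = [payload[i:i + PAGE_SIZE]
                  for i in range(0, len(payload), PAGE_SIZE)] or [b""]
        locations = []
        for chunk in chunks:
            if len(self.pages[-1]) + len(chunk) > PAGE_SIZE:
                self.pages.append(bytearray())  # current page is full, start a new one
            page_id = len(self.pages) - 1
            locations.append((page_id, len(self.pages[page_id]), len(chunk)))
            self.pages[page_id] += chunk
        self.table[oid] = locations

    def load(self, oid):
        # Reassemble the object from every page that holds one of its chunks.
        return b"".join(bytes(self.pages[p][o:o + n]) for p, o, n in self.table[oid])


table = ObjectTable()
table.store(1, b"abc")           # small objects share a page
table.store(2, b"de")
table.store(3, b"0123456789AB")  # larger than a page, so it is chunked
```

Here objects 1 and 2 land on the same page, while object 3 is spread over two pages and reassembled by `load`.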
5. Object Faulting
• When an object is referenced, its page id is looked
up in the Object Table
• the data page is loaded into Shared Page Cache
(SPC)
• objects are copied from SPC to vm memory
• object ids converted to direct memory pointers
• stub objects in vm represent objects not yet loaded
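The faulting steps above can be sketched in a few lines. This is a simplified model under stated assumptions, not GemStone's implementation: `Ref`, `Stub`, `Session`, and the dictionary-based pages are hypothetical names invented for the example.

```python
class Ref:
    """On-disk reference to another object, stored as an object id."""
    def __init__(self, oid):
        self.oid = oid


class Stub:
    """Placeholder in vm memory for an object that has not been loaded yet."""
    def __init__(self, oid):
        self.oid = oid


class Session:
    def __init__(self, object_table, disk_pages):
        self.object_table = object_table  # object id -> page id
        self.disk_pages = disk_pages      # page id -> {object id: {field: value or Ref}}
        self.shared_page_cache = {}       # page id -> page (the SPC)
        self.vm_memory = {}               # object id -> loaded object
        self.page_reads = 0

    def fault(self, oid):
        if oid in self.vm_memory:
            return self.vm_memory[oid]
        page_id = self.object_table[oid]          # look up the page id
        if page_id not in self.shared_page_cache:  # load the page into the SPC
            self.shared_page_cache[page_id] = self.disk_pages[page_id]
            self.page_reads += 1
        obj = {}
        for field, value in self.shared_page_cache[page_id][oid].items():
            if isinstance(value, Ref):
                # Object ids become direct pointers when the target is loaded,
                # and stubs when it is not.
                obj[field] = self.vm_memory.get(value.oid) or Stub(value.oid)
            else:
                obj[field] = value
        self.vm_memory[oid] = obj
        return obj

    def read(self, obj, field):
        value = obj[field]
        if isinstance(value, Stub):  # first touch of a stub triggers a fault
            value = obj[field] = self.fault(value.oid)
        return value


pages = {0: {1: {"name": "root", "child": Ref(2)}}, 1: {2: {"name": "leaf"}}}
session = Session({1: 0, 2: 1}, pages)
root = session.fault(1)
```

After `session.fault(1)` only one page has been read and `root["child"]` is still a stub; calling `session.read(root, "child")` faults in the second page and swaps the stub for a direct reference.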
6. Transactions
• On commit, only modified objects in vm are copied
to new pages in the SPC.
• A record of the object modifications is written to the
transaction log.
• The commit is complete when the transaction log is
successfully written to disk.
• On an abort, all modified objects in vm are converted
to stubs, as are objects changed by other sessions
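A minimal sketch of the commit and abort behaviour described above, assuming an in-memory list stands in for the on-disk transaction log and a dictionary stands in for pages in the SPC. `Repository`, `Session`, and the `view` counter are hypothetical names; conflict detection and view refresh on commit are left out.

```python
class Repository:
    def __init__(self, objects):
        self.committed = dict(objects)  # object id -> committed value
        self.tranlog = []               # one {object id: value} record per commit


class Session:
    def __init__(self, repo):
        self.repo = repo
        self.vm = {}        # object id -> value; a missing id plays the role of a stub
        self.dirty = set()  # ids modified in this transaction
        self.view = len(repo.tranlog)  # log position this session's view is based on

    def read(self, oid):
        if oid not in self.vm:  # stub: fault in the committed state
            self.vm[oid] = self.repo.committed[oid]
        return self.vm[oid]

    def write(self, oid, value):
        self.vm[oid] = value
        self.dirty.add(oid)

    def commit(self):
        # Only modified objects are copied out of the vm.
        record = {oid: self.vm[oid] for oid in self.dirty}
        self.repo.tranlog.append(record)    # in a real system: durable write to disk
        self.repo.committed.update(record)
        self.dirty.clear()

    def abort(self):
        # Objects changed by other sessions since this view was taken.
        changed_elsewhere = {oid for rec in self.repo.tranlog[self.view:] for oid in rec}
        for oid in self.dirty | changed_elsewhere:
            self.vm.pop(oid, None)          # back to a stub
        self.dirty.clear()
        self.view = len(self.repo.tranlog)


repo = Repository({1: "a", 2: "b", 3: "c"})
s1, s2 = Session(repo), Session(repo)
for oid in (1, 2, 3):
    s1.read(oid)
s1.write(1, "a2")   # local change that will be aborted
s2.write(2, "b2")
s2.commit()         # another session commits a change to object 2
s1.abort()
```

After the abort, `s1` keeps only object 3 in memory. Objects 1 and 2 are stubs again and will be re-read in their committed state on next access.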
7. Garbage Collection
• GC is run as a separate (multi-threaded) operating
system process under your control.
• GC is designed to be run while the system is
actively committing.
• Schedule GC to minimize impacts on production
performance.
8. Multiple VMs
• An application can arrange to run multiple Smalltalk
vms to perform concurrent operations.
• Spread the work out over multiple CPUs and even
multiple machines.
• Concurrent commits are allowed as long as two
vms do not modify the same object during
overlapping commits.
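The commit rule in the last bullet can be illustrated with a small write-write conflict check. This is a sketch under assumptions, not GemStone's concurrency manager: `CommitQueue`, `begin`, and `try_commit` are hypothetical names, and only overlapping writes are checked.

```python
class CommitQueue:
    """Toy commit coordinator shared by several vms."""

    def __init__(self):
        self.commits = []  # write set (object ids) of each successful commit, in order

    def begin(self):
        # A transaction remembers how many commits existed when it started.
        return len(self.commits)

    def try_commit(self, started_at, write_set):
        # Any commit that landed after this transaction began overlaps with it.
        for earlier in self.commits[started_at:]:
            if earlier & write_set:
                return False  # both modified the same object: conflict
        self.commits.append(set(write_set))
        return True


queue = CommitQueue()
a = queue.begin()
b = queue.begin()
c = queue.begin()
ok_a = queue.try_commit(a, {1, 2})
ok_b = queue.try_commit(b, {3})      # disjoint objects, so it succeeds
ok_c = queue.try_commit(c, {2, 4})   # object 2 was changed by an overlapping commit
retry_c = queue.try_commit(queue.begin(), {2, 4})  # retry in a fresh transaction
```

The third commit fails because it overlaps with the first and both touch object 2. Retrying from a fresh transaction succeeds.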
9. Large Collections
• Excessive object faulting can occur, especially when
the objects in a large collection do not all fit in the
object memory for a vm (not enough object
memory for your working set)
• Identity-based Collections
• Indexed Collections
10. Identity-based Collections
(IdentityBag/Set)
• The vm performs identity comparisons within
primitives by comparing object ids, instead of
sending messages, so an object fault is not
required
• #includes: (implemented as a primitive) can be
performed without faulting in the elements of the
collection
• very fast even for large collections
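Why an identity test needs no fault can be shown with a toy model in which the collection stores only object ids. `Handle`, `ToyIdentitySet`, and the fault counter are hypothetical names for this sketch. They are not GemStone classes.

```python
class Handle:
    """Reference to a persistent object: the id is known, the state is loaded lazily."""
    faults = 0  # counts how many objects have been faulted in

    def __init__(self, oid, loader):
        self.oid = oid
        self._loader = loader
        self._state = None

    def state(self):
        if self._state is None:  # touching the state costs an object fault
            Handle.faults += 1
            self._state = self._loader(self.oid)
        return self._state


class ToyIdentitySet:
    """Identity-based collection: membership is decided on object ids alone."""

    def __init__(self):
        self._oids = set()

    def add(self, handle):
        self._oids.add(handle.oid)

    def includes(self, handle):
        return handle.oid in self._oids  # no message send, no fault


def load(oid):
    return {"oid": oid}


members = [Handle(oid, load) for oid in range(100_000)]
collection = ToyIdentitySet()
for handle in members:
    collection.add(handle)

found = collection.includes(members[42_000])
missing = collection.includes(Handle(999_999, load))
```

Both membership tests complete with `Handle.faults` still at zero, because no element's state was ever read.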
11. Indexed Collections
• the query result for an indexed collection is created
by copying the object ids directly from the btree
nodes into the result set without faulting in the
objects
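The same idea applies to indexed queries. In the sketch below a pair of sorted lists stands in for the btree leaf nodes, which is an assumption made to keep the example short. `ToyIndex` and `range_query` are hypothetical names.

```python
import bisect


class ToyIndex:
    """Sorted (key, object id) entries standing in for btree leaf nodes."""

    def __init__(self):
        self._keys = []  # indexed values, kept sorted
        self._oids = []  # object id stored next to each key

    def insert(self, key, oid):
        position = bisect.bisect_right(self._keys, key)
        self._keys.insert(position, key)
        self._oids.insert(position, oid)

    def range_query(self, low, high):
        # Copy the object ids for low <= key <= high straight from the index.
        # The objects themselves are never loaded.
        start = bisect.bisect_left(self._keys, low)
        stop = bisect.bisect_right(self._keys, high)
        return self._oids[start:stop]


ages = ToyIndex()
ages.insert(30, 101)
ages.insert(25, 102)
ages.insert(41, 103)
ages.insert(30, 104)
result = ages.range_query(26, 40)
```

The result holds only the object ids whose key is 30. The matching objects would be faulted in later, and only if the application actually reads them.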
12. Develop in Pharo, Deploy in GemStone (DiPDiG)
• Not yet a formalized technique [1]
• port application to GemStone/S, adding GemStone-specific
packages to your BaselineOf if needed
• production deployed in GemStone/S installation
• ongoing development in Pharo
• With gt4gemstone[2] the door is now open for expanding the
GemStone toolset to include direct support for DiPDiG
• PharoGs (future) should reduce “porting requirement”
[1] http://forum.world.st/How-do-you-develop-for-gemstone-in-open-source-tools-pharo-td4952364.html
[2] https://github.com/feenkcom/gt4gemstone