the diary of a datum
Presentation Transcript

  • 1. the diary of a datum (modeling runtime behavior in framework-based applications). Nick Mitchell, Gary Sevitsky, Harini Srinivasan. IBM T.J. Watson Research Center, Hawthorne, NY, USA. July 7, 2006
  • 2. the diary of a timecard
    [Diagram: a timecard is traced from an MQ message through XML, a business object, a serialized Java object, and a DB2 blob to a DB2 record, with extract, parse, serialize, copy (and repackage), and store steps in between. Cost of the parse step: 2000 calls, 300 objects.]
    ● from a deployed server application
      – extensively uses frameworks (large-scale applications act as integrators)
      – this is par for the course for rich clients, too
    ● that's a lot of work for such a simple task!
      – what makes this silly is the large number of data transformations
  • 3. hotspot-free bloat
    [Diagram: the timecard flow of slide 2, from MQ message to DB2 record. Cost of the parse step: 2000 calls, 300 objects.]
    ● tuning the easy stuff may remove hot spots
      – bye bye bubble sort!
    ● but bloat remains
      – the runtime complexity and cost of hotspot-free programs are high
  • 4. the diary of a datum methodology
    ● the analysis scenario
      – structure activity based on information flow
    ● code-agnostic classification
      – label forms of data, and transformations between them, according to the purpose they serve in the scenario
    ● behavior signatures
      – quantify runtime complexity and costs of a scenario in code-agnostic terms
  • 5. part I: the analysis scenario
    structuring computation as flows of the physical forms of logical content
  • 6. information content
    ● various physical representations
      – binary (MQ message)
      – unicode XML (SOAP)
      – heap object (Java)
      – binary (serialized Java object)
      – binary (DB2 blob)
    ● each at the same level of granularity
      – record (e.g. timecard)
      – field (e.g. dates, serial number, manager)
      – subfield (e.g. month or day of a date)
    ● one logical collection of information
      – timecard
  • 7. an example analysis scenario
    [Diagram: bytes (SOAP) → parse, set field in the business object → Calendar* (Java field) → copy to another version of the business object → Date* (Java field). Cost: 268 calls, 70 objects. *new objects]
    analysis scenario
      – code: Trade benchmark v.3.1 (acting as a SOAP client)
      – logical content: stock purchase date (field granularity)
      – source: subsequence of a SOAP message (as bytes)
      – sink: a field in a Java object (used for subsequent HTML rendering)
  • 8. the diary of a date
    [Diagram: bytes (SOAP) → parse, set field in the business object → Calendar* (Java field) → copy to another version of the business object → Date* (Java field). Cost: 268 calls, 70 objects. *new objects]
    analysis scenario
      – code: Trade benchmark v.3.1 (acting as a SOAP client)
      – logical content: stock purchase date (field granularity)
      – source: subsequence of a SOAP message (as bytes)
      – sink: a field in a Java object (used for subsequent HTML rendering)
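The two-step flow on slides 7 and 8 can be sketched in plain Java. This is a minimal illustration, not the Trade benchmark or any SOAP framework: the classes `SoapQuote` and `ViewQuote`, the method `diary`, the date pattern, and the UTC time zone are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.TimeZone;

public class DateDiary {
    /** Hypothetical first version of the business object: holds a Calendar field. */
    public static final class SoapQuote { Calendar purchaseDate; }

    /** Hypothetical second version of the business object: holds a Date field. */
    public static final class ViewQuote { Date purchaseDate; }

    /** Follows one date from SOAP-style bytes to the Date field used for rendering. */
    public static ViewQuote diary(byte[] soapBytes) {
        // Step 1: parse the bytes and set the Calendar field.
        String text = new String(soapBytes, StandardCharsets.UTF_8);      // bytes -> String
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss");
        format.setTimeZone(TimeZone.getTimeZone("UTC"));                  // assumed zone
        Date parsed;
        try {
            parsed = format.parse(text);                                  // String -> Date
        } catch (ParseException e) {
            throw new IllegalArgumentException("not a date: " + text, e);
        }
        Calendar calendar = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
        calendar.setTime(parsed);                                         // Date -> Calendar
        SoapQuote first = new SoapQuote();
        first.purchaseDate = calendar;

        // Step 2: copy to another version of the business object.
        ViewQuote second = new ViewQuote();
        second.purchaseDate = first.purchaseDate.getTime();               // Calendar -> new Date
        return second;
    }

    public static void main(String[] args) {
        byte[] bytes = "2006-07-07T12:00:00".getBytes(StandardCharsets.UTF_8);
        System.out.println(diary(bytes).purchaseDate.getTime());
    }
}
```

Even this stripped-down version creates a String, a SimpleDateFormat, two Date objects, and a Calendar to move a single field, which is the kind of activity the slide's call and object counts measure in the real framework.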
  • 9. zooming in on the SOAP parsing (detail of just the first step of the previous slide)
    [Diagram: two stages, "Parse (using SOAP CalendarDeserializer)" and "Set business object field via reflection".
    Step labels: get deserializer; get schema info; extract value from SOAP tag; parse time zone and millis, reformat without them; parse using SimpleDateFormat; add in timezone and millis; build Calendar + TimeZone*; call set time on Calendar; box into Object[]* array; invoke() Calendar setter.
    Data labels: bytes (SOAP), String*, Date*, Calendar*, TimeZone* (constant), 2 longs (TZ and millis), XML and Java types, Deserializer*, BeanPropertyDescriptor, SimpleDateFormat + Calendar, ParsePosition*.
    Costs listed, in transcript order: 6 calls, 1 object + 11 arrays*; 15 calls, 15 objects; 30 calls, 3 objects; 11 calls, 6 objects; 95 calls, 39 objects; 4 calls, 0 objects; 7 calls, 1 object; 10 calls, 0 objects; 51 calls, 5 objects. *new objects]
  • 10. zooming in on SimpleDateFormat.parse
    [Diagram: String → extract and parse subfield (x 6 for YY, MM, DD, ...) → int → set field in Calendar → Calendar → compute time from Calendar → long time → create Date → Date*. Also shown: boolean[]*. Costs listed, in transcript order: 4 calls, 1 object; 0 calls, 1 object; 14 calls, 6 objects; 1 call, 0 objects. *new objects]
  • 11. zooming in on the subfield parsing (called for every month, day, year, ...)
    [Diagram: String → extract digits → DigitList → copy digits → StringBuffer* → toString() → String* → parse long → long → box → Long* → intValue() → int. Also shown: ParsePosition*, boolean[]*. *new objects]
    Costs listed:
      – Parse number using DecimalFormat.parse(): 11 calls, 5 objects
      – Parse long using DigitList.getLong(): 4 calls, 3 objects
      – other figures listed: 600 instructions; 1 call, 0 objects
    Physical change of each transformation (phenomenon, then fundamental properties):
      – extract digits: structure copy (copy, type change, id change)
      – copy digits: structure copy (copy, type change, id change, create)
      – toString(): rewrap (type change, id change, create)
      – parse long: convert (copy, bit change, type change, id change)
      – box: box (copy, type change, id change, create)
      – intValue(): unbox (copy, type change, id change)
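To make the subfield chain concrete, the sketch below parses a two-digit subfield twice: once through the public `DecimalFormat.parse(String, ParsePosition)` call that the slide zooms in on, and once with a hand-written digit loop. The class and method names are hypothetical, and the loop is only a point of comparison, not something the slides propose.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.ParsePosition;
import java.util.Locale;

public class SubfieldParse {
    /** Framework-style path: format object, cursor object, boxed result, then unbox. */
    public static int viaDecimalFormat(String text, int start) {
        DecimalFormat format = new DecimalFormat("0", DecimalFormatSymbols.getInstance(Locale.ROOT));
        format.setParseIntegerOnly(true);
        ParsePosition cursor = new ParsePosition(start);   // cursor object
        Number boxed = format.parse(text, cursor);         // convert, then box
        if (boxed == null) {
            throw new IllegalArgumentException("no digits at index " + start);
        }
        return boxed.intValue();                           // unbox
    }

    /** Direct path: one conversion from characters to an int, with no intermediate objects. */
    public static int direct(String text, int start, int end) {
        int value = 0;
        for (int i = start; i < end; i++) {
            int digit = Character.digit(text.charAt(i), 10);
            if (digit < 0) {
                throw new IllegalArgumentException("not a digit at index " + i);
            }
            value = value * 10 + digit;
        }
        return value;
    }

    public static void main(String[] args) {
        String date = "2006-07-07";
        System.out.println(viaDecimalFormat(date, 5));
        System.out.println(direct(date, 5, 7));
    }
}
```

Both methods return the same int for the month subfield; the difference lies in how many intermediate forms the digits pass through on the way.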
  • 12. part II: the purpose of objects and transformations
    a multi-dimensional labeling of diagrams
  • 13. transformations as switches
  • 14. data immediately used for... [Diagram labels: input data, output data, facilitators]
  • 15. only and eventually used for... [Diagram label: facilitator]
  • 16. only and eventually used for... [Diagram labels: facilitator, facilitator-related activity]
  • 17. the purpose of a datum (a taxonomy)
  • 18. the purpose of a datum (a taxonomy)
    callout: the forms of the information produced by the analysis scenario
  • 19. the purpose of a datum (a taxonomy)
    callout: e.g. SimpleDateFormat, ByteToCharConverter, XML deserializers
  • 20. example: classifying data by immediate purpose
    [Diagram: the SOAP parsing diagram of slide 9, with the same stages ("Parse (using SOAP CalendarDeserializer)" and "Set business object field via reflection") and the same costs, now with each datum labeled by purpose. Labels used: Carrier, Converter, Cursor, Schema. Pairings that survive in the transcript: SimpleDateFormat + Calendar as Converter; ParsePosition* as Cursor; Deserializer* as Converter; BeanPropertyDescriptor as Schema; the bytes, String*, Date*, Calendar, and Object[]* along the main flow as Carrier. *new objects]
  • 21. how transformations alter carrier data
    [Diagram labels: input data, output data]
  • 22. how transformations alter carrier data
    ● physical changes
      – copy
      – bit change
      – type change
      – id change
      – new object created
    ● logical changes
      – instance
      – value
      – granularity
  • 23. how transformations alter carrier data (lists as on slide 22). Example: StringBuffer → String
  • 24. how transformations alter carrier data (lists as on slide 22). StringBuffer → String is an instance of a rewrap idiom of physical change
  • 25. how transformations alter carrier data (lists as on slide 22). StringBuffer → String is an instance of an information-preserving idiom of logical change
  • 26. how transformations alter carrier data (lists as on slide 22). Example: byte[] → char[] via ByteToCharConverter
  • 27. how transformations alter carrier data (lists as on slide 22). byte[] → char[] via ByteToCharConverter is an instance of a convert idiom of physical change
  • 28. how transformations alter carrier data (lists as on slide 22). Example: add sales tax (price, New York sales tax → price)
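The three examples on slides 23 to 28 can be written out as small Java methods. This is a sketch under assumptions: `Charset.decode` stands in for the internal `ByteToCharConverter` named on the slide, the tax rate is a caller-supplied parameter rather than any real rate, and the class and method names are hypothetical.

```java
import java.math.BigDecimal;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;

public class CarrierChanges {
    /** Rewrap: the same characters in a new wrapper type; the logical value is unchanged. */
    public static String rewrap(StringBuffer buffer) {
        return buffer.toString();
    }

    /** Convert: the bits change (UTF-8 bytes become UTF-16 chars); the logical value is unchanged. */
    public static char[] convert(byte[] utf8) {
        CharBuffer decoded = StandardCharsets.UTF_8.decode(ByteBuffer.wrap(utf8));
        char[] chars = new char[decoded.remaining()];
        decoded.get(chars);
        return chars;
    }

    /** Logical value change: the output price is a different value from the input price. */
    public static BigDecimal addSalesTax(BigDecimal price, BigDecimal rate) {
        return price.add(price.multiply(rate));
    }

    public static void main(String[] args) {
        System.out.println(rewrap(new StringBuffer("2006")));
        System.out.println(new String(convert("07".getBytes(StandardCharsets.UTF_8))));
        System.out.println(addSalesTax(new BigDecimal("100.00"), new BigDecimal("0.08")));
    }
}
```

The first two methods preserve information and only change its physical form; the third is the only one whose output carries a new logical value.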
  • 29. labeling transformations according to physical change
    [Diagram: String → extract digits → DigitList → copy digits → StringBuffer* → toString() → String* → parse long → long → box → Long* → intValue() → int. Also shown: ParsePosition*, boolean[]*. *new objects]
    Costs listed:
      – Parse number using DecimalFormat.parse(): 11 calls, 5 objects
      – Parse long using DigitList.getLong(): 4 calls, 3 objects
      – other figures listed: 600 instructions; 1 call, 0 objects
    Physical change of each transformation (phenomenon, then fundamental properties):
      – extract digits: structure copy (copy, type change, id change)
      – copy digits: structure copy (copy, type change, id change, create)
      – toString(): rewrap (type change, id change, create)
      – parse long: convert (copy, bit change, type change, id change)
      – box: box (copy, type change, id change, create)
      – intValue(): unbox (copy, type change, id change)
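One way to see how such labels turn into counts is to encode the table as data and tally it. The sketch below is an assumption about representation only: the enum and class names are hypothetical, and the six rows are transcribed from the slide's table.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.List;
import java.util.Map;

public class PhysicalChangeLabels {
    public enum Change { COPY, BIT_CHANGE, TYPE_CHANGE, ID_CHANGE, CREATE }
    public enum Idiom { STRUCTURE_COPY, REWRAP, CONVERT, BOX, UNBOX }

    /** One labeled transformation from the diagram. */
    public static final class Step {
        final String name;
        final Idiom idiom;
        final EnumSet<Change> changes;

        Step(String name, Idiom idiom, EnumSet<Change> changes) {
            this.name = name;
            this.idiom = idiom;
            this.changes = changes;
        }
    }

    // Every step in the slide's table has a type change and an id change; extras vary.
    private static Step step(String name, Idiom idiom, Change... extras) {
        EnumSet<Change> changes = EnumSet.of(Change.TYPE_CHANGE, Change.ID_CHANGE);
        for (Change extra : extras) {
            changes.add(extra);
        }
        return new Step(name, idiom, changes);
    }

    /** The six transformations of the subfield parse, as labeled on the slide. */
    public static List<Step> subfieldParse() {
        List<Step> steps = new ArrayList<>();
        steps.add(step("extract digits", Idiom.STRUCTURE_COPY, Change.COPY));
        steps.add(step("copy digits", Idiom.STRUCTURE_COPY, Change.COPY, Change.CREATE));
        steps.add(step("toString()", Idiom.REWRAP, Change.CREATE));
        steps.add(step("parse long", Idiom.CONVERT, Change.COPY, Change.BIT_CHANGE));
        steps.add(step("box", Idiom.BOX, Change.COPY, Change.CREATE));
        steps.add(step("intValue()", Idiom.UNBOX, Change.COPY));
        return steps;
    }

    /** Counts how many steps fall under each idiom. */
    public static Map<Idiom, Integer> idiomCounts(List<Step> steps) {
        Map<Idiom, Integer> counts = new EnumMap<>(Idiom.class);
        for (Step s : steps) {
            counts.merge(s.idiom, 1, Integer::sum);
        }
        return counts;
    }

    /** Counts how many steps exhibit one fundamental property. */
    public static int countWith(List<Step> steps, Change change) {
        int count = 0;
        for (Step s : steps) {
            if (s.changes.contains(change)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<Step> steps = subfieldParse();
        System.out.println(idiomCounts(steps));
        System.out.println("copies: " + countWith(steps, Change.COPY));
    }
}
```

Tallies of this kind describe the flow without naming any class or method in it, which is what makes the labeling code-agnostic.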
  • 30. part III: behavior signatures
    counting transformations by purpose to quantify the complexity and costs of an analysis scenario
  • 31. even the simple things are hard to get right
    ● three versions of StringBuffer.append(int)

      JDK version   bit changes   copies   information exchanges   carriers created
      pre-1.4.2     1             2        0                       1
      1.4.2         1             2        1                       0
      1.5.0         1             1        0                       0

  • 32. even the simple things are hard to get right (table as on slide 31). bit changes and copies: two “behavior signatures” of physical change
  • 33. even the simple things are hard to get right (table as on slide 31). information exchanges: one “behavior signature” of logical change
  • 34. even the simple things are hard to get right (table as on slide 31). carriers created: one “behavior signature” of object purpose
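The difference between the first and last rows of the table can be illustrated with two hand-written ways of appending an int to a character buffer. This is not the JDK source for any version of `StringBuffer.append(int)`; the `Buffer` class and both methods are hypothetical, and the direct variant handles only non-negative values to keep the sketch short.

```java
import java.util.Arrays;

public class AppendIntSketch {
    /** A minimal growable character buffer, standing in for StringBuffer. */
    public static final class Buffer {
        char[] chars = new char[16];
        int length;

        void ensure(int extra) {
            if (length + extra > chars.length) {
                chars = Arrays.copyOf(chars, Math.max(chars.length * 2, length + extra));
            }
        }

        public String contents() {
            return new String(chars, 0, length);
        }
    }

    /** One bit change, two copies, one temporary carrier (the String). */
    public static void appendViaString(Buffer buffer, int value) {
        String digits = Integer.toString(value);   // int -> chars, written into a new String
        buffer.ensure(digits.length());
        digits.getChars(0, digits.length(), buffer.chars, buffer.length);   // second copy
        buffer.length += digits.length();
    }

    /** One bit change, one copy, no temporary carrier: digits go straight into the buffer. */
    public static void appendDirect(Buffer buffer, int value) {
        if (value < 0) {
            throw new IllegalArgumentException("sketch handles non-negative values only");
        }
        int digitCount = 1;
        for (int rest = value; rest >= 10; rest /= 10) {
            digitCount++;
        }
        buffer.ensure(digitCount);
        int remaining = value;
        for (int i = buffer.length + digitCount - 1; i >= buffer.length; i--) {
            buffer.chars[i] = (char) ('0' + remaining % 10);   // write digits right to left
            remaining /= 10;
        }
        buffer.length += digitCount;
    }

    public static void main(String[] args) {
        Buffer a = new Buffer();
        Buffer b = new Buffer();
        appendViaString(a, 2006);
        appendDirect(b, 2006);
        System.out.println(a.contents() + " " + b.contents());
    }
}
```

Both variants leave the same characters in the buffer, so they are indistinguishable by result; only counts such as those in the table tell them apart.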
  • 35. benchmark versus application (analyzed by idioms of physical change)
    ● analysis scenario: Date field

                  structure copy   rewrap   convert   box/unbox
      benchmark   0                0        1         1
      app1        3                0        0         1
      app2        0                2        2         4

    ● analysis scenario: BigDecimal field

                  structure copy   rewrap   convert   box/unbox
      benchmark   0                0        1         1
      app2        4                1        1         5
  • 36. a language for discourse about runtime complexity
    ● evaluating benchmarks
    ● choosing a framework implementation to use
    ● API design and implementation
    ● identify new compiler optimizations
    ● inspire adoption of good language constructs
      – warning: two years later, we'll be back where we started :)