Stoop 390-instruction stream

Published in: Technology, Education
  1. Working with Bytecodes: IRBuilder and InstructionStream
     Marcus Denker, denker@iam.unibe.ch
     S.Ducasse
  2. License: CC-Attribution-ShareAlike 2.0
     http://creativecommons.org/licenses/by-sa/2.0/
  3. Reasons for working with Bytecode
     • Generating Bytecode
       • Implementing compilers for other languages
       • Experimentation with new language features
     • Parsing and Interpretation:
       • Analysis
       • Decompilation (for systems without source)
       • Pretty printing
       • Interpretation: Debugger, Profiler
  4. Overview
     • Squeak Bytecodes
     • Examples: Generating Bytecode
       • with IRBuilder
     • Decoding Bytecodes
     • Bytecode Execution
  5. The Squeak Virtual Machine
     • Virtual machine provides a virtual processor
     • Bytecode: the 'machine code' of the virtual machine
     • Smalltalk (like Java): stack machine
       • easy to implement interpreters for different processors
       • most hardware processors are register machines
     • Squeak VM: implemented in Slang
       • Slang: subset of Smalltalk ("C with Smalltalk syntax")
       • Translated to C
  6. Bytecode in the CompiledMethod
     • CompiledMethod format:
       • Header: number of temps, literals...
       • Literals: array of all literal objects
       • Bytecode
       • Trailer: pointer to source
     • Inspect one:
         (Number>>#asInteger) inspect
  7. Bytecodes: Single or multibyte
     • Different forms of bytecodes:
       • Single bytecodes:
         • Example: 112: push self
       • Groups of similar bytecodes
         • 16: push temp 1
         • 17: push temp 2
         • up to 31
         • Byte layout: Type (4 bits), Offset (4 bits)
       • Multibyte bytecodes
         • Problem: a 4-bit offset may be too small
         • Solution: use the following byte as offset
         • Example: jumps need to encode large jump offsets
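The type/offset layout on the slide above can be tried outside a Squeak image. The following is a minimal sketch in Python, used only so that it runs standalone. The function name `split_bytecode` is hypothetical, not a Squeak API, and it assumes the type sits in the high 4 bits and the offset in the low 4 bits.

```python
def split_bytecode(byte):
    """Split one bytecode into (type, offset): its high and low 4 bits."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("a bytecode must fit in one byte")
    return byte >> 4, byte & 0x0F


# The "push temp" group from the slide: bytecodes 16, 17, ... 31.
for byte in (16, 17, 31):
    kind, offset = split_bytecode(byte)
    print(f"{byte:3d} = <{byte:02X}>: type {kind}, offset {offset}")
```

All three bytes share type 1 and differ only in the offset, which is why one group of 16 bytecodes covers 16 temporaries and anything larger needs a following byte.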
  8. Example: Number>>asInteger
     • Smalltalk code:
         Number>>asInteger
             "Answer an Integer nearest the receiver toward zero."
             ^self truncated
     • Symbolic bytecode:
         9 <70> self
         10 <D0> send: truncated
         11 <7C> returnTop
  9. Example: Step by Step
     • 9 <70> self
       • The receiver (self) is pushed on the stack
     • 10 <D0> send: truncated
       • Bytecode 208: send literal selector 1
       • Get the selector from the first literal
       • Start the message lookup in the class of the object on top of the stack
       • The result is pushed on the stack
     • 11 <7C> returnTop
       • Return the object on top of the stack to the calling method
  10. Squeak Bytecodes
      • 256 Bytecodes, four groups:
        • Stack Bytecodes
          • Stack manipulation: push / pop / dup
        • Send Bytecodes
          • Invoke methods
        • Return Bytecodes
          • Return to caller
        • Jump Bytecodes
          • Control flow inside a method
  11. Stack Bytecodes
      • push and store, e.g. temps, instVars, literals
        • e.g. 16 - 31: push temporary variable
      • Push constants (false/true/nil/1/0/2/-1)
      • Push self, thisContext
      • duplicate top of stack
      • pop
  12. Sends and Returns
      • Sends: receiver is on top of stack
        • Normal send
        • Super sends
        • Hard-coded sends for efficiency, e.g. +, -
      • Returns
        • Return top of stack to the sender
        • Return from a block
        • Special bytecodes for return self, nil, true, false (for efficiency)
  13. Jump Bytecodes
      • Control flow inside one method
      • Used to implement control flow efficiently
      • Example:
          ^ 1<2 ifTrue: ['true']

          9 <76> pushConstant: 1
          10 <77> pushConstant: 2
          11 <B2> send: <
          12 <99> jumpFalse: 15
          13 <20> pushConstant: 'true'
          14 <90> jumpTo: 16
          15 <73> pushConstant: nil
          16 <7C> returnTop
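To see how the jump example above executes on a stack machine, here is a toy interpreter sketched in Python (chosen only so that it runs without a Squeak image). The mnemonics and pc values are copied from the slide. The tuple encoding, the `BINARY_SENDS` table and the `run` function are assumptions made for illustration and do not reflect how the Squeak VM is written.

```python
import operator

# Hypothetical encoding: pc -> (mnemonic, operand), taken from the slide.
PROGRAM = {
    9: ("pushConstant", 1),
    10: ("pushConstant", 2),
    11: ("send", "<"),
    12: ("jumpFalse", 15),
    13: ("pushConstant", "true"),
    14: ("jumpTo", 16),
    15: ("pushConstant", None),  # None stands in for nil
    16: ("returnTop", None),
}

# Only a few binary selectors are modelled in this sketch.
BINARY_SENDS = {"<": operator.lt, "+": operator.add, "-": operator.sub}


def run(program, start_pc):
    """Execute the toy program and answer the returned value."""
    stack = []
    pc = start_pc
    while True:
        op, arg = program[pc]
        pc += 1  # default: continue with the next instruction
        if op == "pushConstant":
            stack.append(arg)
        elif op == "send":
            argument = stack.pop()
            receiver = stack.pop()
            stack.append(BINARY_SENDS[arg](receiver, argument))
        elif op == "jumpFalse":
            if stack.pop() is False:
                pc = arg
        elif op == "jumpTo":
            pc = arg
        elif op == "returnTop":
            return stack.pop()
        else:
            raise ValueError(f"unknown instruction {op!r}")


print(run(PROGRAM, 9))  # prints: true
```

Replacing the constant at pc 10 with 0 makes the comparison false, so the conditional jump skips to pc 15 and the method answers nil (`None` here).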
  14. What you should have learned...
      • ... dealing with bytecodes directly is possible, but very boring.
      • We want reusable abstractions that hide the details (e.g. the different send bytecodes)
      • We would like to have frameworks for
        • Generating bytecode easily
        • Parsing bytecode
        • Evaluating bytecode
  15. Generating Bytecodes
      • IRBuilder
        • Part of the new compiler (on SqueakMap)
        • Like an assembler for Squeak
  16. IRBuilder: Simple Example
      • Number>>asInteger
          iRMethod := IRBuilder new
              rargs: #(self);     "receiver and args"
              pushTemp: #self;
              send: #truncated;
              returnTop;
              ir.
          aCompiledMethod := iRMethod compiledMethod.
          aCompiledMethod valueWithReceiver: 3.5 arguments: #()
  17. IRBuilder: Stack Manipulation
      • popTop - remove the top of stack
      • pushDup - push top of stack on the stack
      • pushLiteral:
      • pushReceiver - push self
      • pushThisContext
  18. IRBuilder: Symbolic Jumps
      • Jump targets are resolved:
      • Example: false ifTrue: ['true'] ifFalse: ['false']
          iRMethod := IRBuilder new
              rargs: #(self);     "receiver and args"
              pushLiteral: false;
              jumpAheadTo: #false if: false;
              pushLiteral: 'true';     "ifTrue: ['true']"
              jumpAheadTo: #end;
              jumpAheadTarget: #false;
              pushLiteral: 'false';     "ifFalse: ['false']"
              jumpAheadTarget: #end;
              returnTop;
              ir.
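What "jump targets are resolved" means can be illustrated with a small label-resolution pass. The sketch below is in Python so that it runs standalone. The tuple encoding and the function `resolve_jumps` are hypothetical, and this is not a description of IRBuilder's internals, only of the general assembler technique of replacing symbolic forward labels with concrete positions.

```python
def resolve_jumps(instructions):
    """Replace symbolic forward-jump labels by instruction indices."""
    targets = {}
    resolved = []
    # First pass: drop the target markers, remembering where each label points.
    for instruction in instructions:
        if instruction[0] == "jumpAheadTarget":
            targets[instruction[1]] = len(resolved)
        else:
            resolved.append(instruction)
    # Second pass: patch every jump with the index recorded for its label.
    return [
        (op, targets[args[0]], *args[1:]) if op == "jumpAheadTo" else (op, *args)
        for op, *args in resolved
    ]


# The slide's example, written as tuples (an assumed encoding).
symbolic = [
    ("pushLiteral", False),
    ("jumpAheadTo", "false", False),  # jumpAheadTo: #false if: false
    ("pushLiteral", "true"),
    ("jumpAheadTo", "end"),
    ("jumpAheadTarget", "false"),
    ("pushLiteral", "false"),
    ("jumpAheadTarget", "end"),
    ("returnTop",),
]

for index, instruction in enumerate(resolve_jumps(symbolic)):
    print(index, instruction)
```

Two passes are needed because a forward jump is seen before its target, so the position of the label is not yet known when the jump is first read.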
  19. IRBuilder: Instance Variables
      • Access by offset
      • Read: getField:
        • receiver on top of stack
      • Write: setField:
        • receiver and value on stack
      • Example: set the first instance variable to 2
          iRMethod := IRBuilder new
              rargs: #(self);     "receiver and args declarations"
              pushLiteral: 2;
              pushTemp: #self;
              setField: 1;
              pushTemp: #self;
              returnTop;
              ir.
          aCompiledMethod := iRMethod compiledMethod.
          aCompiledMethod valueWithReceiver: 1@2 arguments: #()
  20. IRBuilder: Temporary Variables
      • Accessed by name
      • Define with addTemp: / addTemps:
      • Read with pushTemp:
      • Write with storeTemp:
      • Example: set variables a and b, return value of a
          iRMethod := IRBuilder new
              rargs: #(self);     "receiver and args"
              addTemps: #(a b);
              pushLiteral: 1;
              storeTemp: #a;
              pushLiteral: 2;
              storeTemp: #b;
              pushTemp: #a;
              returnTop;
              ir.
  21. IRBuilder: Sends
      • normal send
          builder pushLiteral: 'hello'.
          builder send: #size.
      • super send
          ....
          builder send: #selector toSuperOf: aClass.
        • The second parameter specifies the class where the lookup starts.
  22. IRBuilder: Lessons learned
      • IRBuilder: easy bytecode generation
      • Not part of Squeak yet
        • Will be added to Squeak 3.9 or 4.0
      • Next: Decoding bytecode
  23. Parsing and Interpretation
      • First step: parse bytecode
        • enough for easy analysis, pretty printing, decompilation
      • Second step: interpretation
        • needed for simulation, complex analysis (e.g., profiling)
      • Squeak provides frameworks for both:
        • InstructionStream/InstructionClient (parsing)
        • ContextPart (interpretation)
  24. The InstructionStream Hierarchy
        InstructionStream
            ContextPart
                BlockContext
                MethodContext
            Decompiler
            InstructionPrinter
            InstVarRefLocator
            BytecodeDecompiler
  25. InstructionStream
      • Parses the byte-encoded instructions
      • State:
        • pc: program counter
        • sender: the method (bad name!)

          Object subclass: #InstructionStream
              instanceVariableNames: 'sender pc'
              classVariableNames: 'SpecialConstants'
              poolDictionaries: ''
              category: 'Kernel-Methods'
  26. Usage
      • Generate an instance:
          instrStream := InstructionStream on: aMethod
      • Now we can step through the bytecode with
          instrStream interpretNextInstructionFor: client
      • This calls a method on the client object for each type of bytecode, e.g.
        • pushReceiver
        • pushConstant: value
        • pushReceiverVariable: offset
  27. InstructionClient
      • Abstract superclass
      • Defines empty methods for all methods that InstructionStream calls on a client
      • For convenience:
        • Clients don't need to inherit from this class

          Object subclass: #InstructionClient
              instanceVariableNames: ''
              classVariableNames: ''
              poolDictionaries: ''
              category: 'Kernel-Methods'
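The division of labour between a decoding stream and a client with empty callbacks can be sketched outside Squeak. The Python below is a toy model written only so that it runs standalone. All class and method names are hypothetical, and the decoder understands just three kinds of single-byte bytecode: 0 to 15 as "push receiver variable", and the <70> (push self) and <7C> (returnTop) bytes shown in the asInteger example.

```python
class InstructionClient:
    """Empty callbacks, one per instruction kind; subclasses override a few."""

    def push_receiver(self):
        pass

    def push_receiver_variable(self, offset):
        pass

    def method_return_top(self):
        pass


class ToyInstructionStream:
    """Decodes a tiny, assumed subset of single-byte bytecodes."""

    def __init__(self, code):
        self.code = bytes(code)
        self.pc = 0

    def at_end(self):
        return self.pc >= len(self.code)

    def interpret_next_instruction_for(self, client):
        byte = self.code[self.pc]
        self.pc += 1
        if byte <= 0x0F:
            client.push_receiver_variable(byte)
        elif byte == 0x70:
            client.push_receiver()
        elif byte == 0x7C:
            client.method_return_top()
        else:
            raise ValueError(f"bytecode {byte} is outside this toy subset")


class InstVarCounter(InstructionClient):
    """Counts instance variable reads and ignores everything else."""

    def __init__(self):
        self.count = 0

    def push_receiver_variable(self, offset):
        self.count += 1


stream = ToyInstructionStream([0x00, 0x01, 0x7C])
counter = InstVarCounter()
while not stream.at_end():
    stream.interpret_next_instruction_for(counter)
print(counter.count)  # prints: 2
```

Because the base class answers every callback with a no-op, an analysis only has to override the instructions it cares about, which is the convenience the slide refers to.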
  28. Example: A test
        InstructionClientTest>>testInstructions
            "just interpret all methods of Object"
            | methods client scanner |
            methods := Object methodDict values.
            client := InstructionClient new.
            methods do: [:method |
                scanner := InstructionStream on: method.
                [scanner pc <= method endPC] whileTrue: [
                    self
                        shouldnt: [scanner interpretNextInstructionFor: client]
                        raise: Error]]
  29. Example: Printing Bytecodes
      • InstructionPrinter: print the bytecodes as human-readable text
      • Example: print the bytecode of Number>>asInteger:
          String streamContents: [:str |
              (InstructionPrinter on: Number>>#asInteger)
                  printInstructionsOn: str]
      • Result:
          '9 <70> self
          10 <D0> send: truncated
          11 <7C> returnTop
          '
  30. InstructionPrinter
      • Class definition:
          InstructionClient subclass: #InstructionPrinter
              instanceVariableNames: 'method scanner stream oldPC indent'
              classVariableNames: ''
              poolDictionaries: ''
              category: 'Kernel-Methods'
  31. InstructionPrinter
      • Main loop:
          InstructionPrinter>>printInstructionsOn: aStream
              "Append to the stream, aStream, a description of each
              bytecode in the instruction stream."
              | end |
              stream := aStream.
              scanner := InstructionStream on: method.
              end := method endPC.
              oldPC := scanner pc.
              [scanner pc <= end]
                  whileTrue: [scanner interpretNextInstructionFor: self]
  32. InstructionPrinter
      • Overrides methods from InstructionClient to print the bytecodes as text
      • e.g. the method for pushReceiver:
          InstructionPrinter>>pushReceiver
              "Print the Push Active Context's Receiver on Top Of Stack bytecode."
              self print: 'self'
  33. Example: InstVarRefLocator
        InstructionClient subclass: #InstVarRefLocator
            instanceVariableNames: 'bingo'
            ....

        interpretNextInstructionUsing: aScanner
            bingo := false.
            aScanner interpretNextInstructionFor: self.
            ^bingo

        popIntoReceiverVariable: offset
            bingo := true

        pushReceiverVariable: offset
            bingo := true

        storeIntoReceiverVariable: offset
            bingo := true
  34. InstVarRefLocator
      • CompiledMethod>>#hasInstVarRef
          hasInstVarRef
              "Answer whether the receiver references an instance variable."
              | scanner end printer |
              scanner := InstructionStream on: self.
              printer := InstVarRefLocator new.
              end := self endPC.
              [scanner pc <= end] whileTrue: [
                  (printer interpretNextInstructionUsing: scanner)
                      ifTrue: [^true]].
              ^false
  35. InstVarRefLocator
      • Example for a simple bytecode analyzer
      • Usage:
          aMethod hasInstVarRef
          (TestCase>>#debug) hasInstVarRef     "true"
          (Integer>>#+) hasInstVarRef     "false"
  36. Example: Decompiling to IR
      • BytecodeDecompiler
          InstructionStream subclass: #BytecodeDecompiler
              instanceVariableNames: 'irBuilder blockFlag'
              classVariableNames: ''
              poolDictionaries: ''
              category: 'Compiler-Bytecodes'
      • Usage:
          BytecodeDecompiler new decompile: (Number>>#asInteger)
  37. Decompiling
      • uses IRBuilder for building IR
      • e.g. code for the bytecode pushReceiver:
          BytecodeDecompiler>>pushReceiver
              irBuilder pushReceiver
  38. ContextPart: Execution Semantics
      • Sometimes we need more than parsing
        • "stepping" in the debugger
        • system simulation for profiling

          InstructionStream subclass: #ContextPart
              instanceVariableNames: 'stackp'
              classVariableNames: 'PrimitiveFailToken QuickStep'
              poolDictionaries: ''
              category: 'Kernel-Methods'
  39. Simulation
      • Provides a complete bytecode interpreter
      • Run a block with the simulator:
          (ContextPart runSimulated: [3 factorial])
  40. Profiling: MessageTally
      • Usage:
          MessageTally tallySends: [3 factorial]

          This simulation took 0.0 seconds.
          **Tree**
          1 SmallInteger(Integer)>>factorial
            1 SmallInteger(Integer)>>factorial
              1 SmallInteger(Integer)>>factorial
                1 SmallInteger(Integer)>>factorial
          **Leaves**
          4 SmallInteger(Integer)>>factorial
          3 SmallInteger(Integer)>>factorial
          1 BlockContext>>DoIt
      • Other example:
          MessageTally tallySends: ['3' + 1]