Certifying (RISC) Machine
Code Safe from Aliasing
Peter T. Breuer
University of Birmingham, UK
Jonathan P. Bowen
London So...
Certifying (RISC) Machine Code Safe from Aliasing (OpenCert 2013)

Slide presentation for Certifying (RISC) Machine Code Safe from Aliasing, presented at OpenCert 2013, Madrid. See http://www.academia.edu/3244313/Certifying_Machine_Code_Safe_from_Hardware_Aliasing_RISC_is_not_necessarily_risky.

Published in: Technology, Education

Transcript of "Certifying (RISC) Machine Code Safe from Aliasing (OpenCert 2013)"

  1. Certifying (RISC) Machine Code Safe from Aliasing
     Peter T. Breuer, University of Birmingham, UK
     Jonathan P. Bowen, London South Bank University, UK
  2. Little and Large Problem
     ● Small arithmetic unit, embedded processor
       – 40 bit arithmetic
     ● Large memory unit
       – 64 bit addressing
     ● What do we do with the extra wires?
  3. Hardware Aliasing
     ● What happens to the extra wires?
       – Depends on the hardware
     ● 4 + 0xfffffffffffffffc = 0x0000000000000000 or 0xfffff00000000000 ?
     ● Both mean 0
       – If you use arithmetic to calculate address 0
         ● Sometimes you get the 0 you want
         ● Sometimes not!
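A minimal Python sketch of the effect on slide 3, assuming a 40-bit arithmetic unit feeding a 64-bit address bus as on slide 2. The two `add_*` functions are hypothetical hardware behaviours invented for illustration (one drives the extra wires to zero, the other lets them pass through from an operand). Real hardware may do something else, so the non-zero alias printed here need not match the slide's constant digit for digit.

```python
ADDR_BITS = 64   # width of the memory unit's address bus (slide 2)
ALU_BITS = 40    # width of the arithmetic unit (slide 2)

ALU_MASK = (1 << ALU_BITS) - 1
ADDR_MASK = (1 << ADDR_BITS) - 1
HIGH_MASK = ADDR_MASK & ~ALU_MASK   # the "extra wires" above the ALU width


def add_zero_high(a, b):
    """Hypothetical hardware A: the extra wires are driven to 0."""
    return (a + b) & ALU_MASK


def add_keep_high(a, b):
    """Hypothetical hardware B: the extra wires pass through from operand b."""
    return ((a + b) & ALU_MASK) | (b & HIGH_MASK)


a, b = 4, 0xFFFFFFFFFFFFFFFC
r1 = add_zero_high(a, b)
r2 = add_keep_high(a, b)
print(f"{r1:#018x}")
print(f"{r2:#018x}")

# A memory unit that decodes all 64 wires treats the two results as
# different locations, although both "mean 0" to the arithmetic unit.
memory = {0: "the cell you want"}
print(memory.get(r1))
print(memory.get(r2))
```

Both results agree on the low 40 bits, so the arithmetic unit regards them as the same address 0, but only one of them reaches the intended memory cell.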
  4. Also happens in KPU
     ● A KPU is an encrypted processor
       – Instead of 4 - 4 = 0
       – It does 99900 - 99900 = 78763298
     ● Homomorphism conditions on encrypted arithmetic guarantee correct behaviour
       – Real encryption is always 1-many
         ● The encoding of 0 is 9896861
         ● 99900 - 99900 = 78763298 ≠ 9896861
         ● Another encoding of 0 is 78763298
       – Encrypted arithmetic gives different result
         ● Depending on how you do the calculation
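The one-to-many point can be illustrated with a toy encoding. This is only a sketch: the modulus `N` and the functions `encode`, `decode` and `enc_sub` are hypothetical and are not the KPU's actual encryption, which the slides do not specify. The toy only shares the two properties the slide names: arithmetic on encodings respects decoding (the homomorphism condition), and each value has many encodings.

```python
N = 1_000_003  # hypothetical secret modulus, chosen only for illustration


def encode(x, r):
    """Return one of the many encodings of x; r selects which one."""
    return x + r * N


def decode(c):
    """Recover the plain value from any of its encodings."""
    return c % N


def enc_sub(c1, c2):
    """'Encrypted' subtraction: operates on encodings without decoding them."""
    return c1 - c2


four_a = encode(4, r=7)
four_b = encode(4, r=2)          # a different encoding of the same 4

zero_1 = enc_sub(four_a, four_a)  # 4 - 4 computed one way
zero_2 = enc_sub(four_a, four_b)  # 4 - 4 computed another way

print(decode(zero_1), decode(zero_2))
print(zero_1 == zero_2)
```

Both results decode to 0, yet the encodings differ, so a memory unit that is addressed by the encoding would see two different addresses.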
  5. Problem
     ● How to check a program is safe from hardware aliasing
     ● Where `hardware aliasing' means that arithmetic on addresses does not always give the same result
       – Trust only exactly the same calculation
       – Because 4 - 4 != 0
       – It's `equivalent' to 0, not identical!
  6. Can imagine in both cases ...
     ● Values have invisible extra bits
     ● 42.1101101
     ● Represent different encodings of '42'
     ● Arithmetic ignores but mutates the extra bits
     ● 42.1101101 + 42.1100001 = 84.0110110
     ● Memory unit is sensitive to invisible extra bits
     ● Can't see just '42'.
     ● Needs loving care from programmer
  7. How to deal with hardware aliasing
     ● Left program returns different alias of SP to caller

     Bad (left):
         Subroutine foo:
           SP -= 32   # 8 local vars
           …code ...
           SP += 32   # destroy frame
           return

     Good (right):
         Subroutine foo:
           GP = SP
           SP -= 32
           …code ...
           SP = GP
           return
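The difference between the two subroutines on slide 7 can be simulated in a few lines of Python. The sketch assumes the 40-bit/64-bit split from slide 2 and a hypothetical `alu_add` whose upper wires carry the OR of its inputs. That behaviour is an assumption made for illustration, not something the slides state. `bad_foo` and `good_foo` mirror the two versions of `foo`.

```python
ALU_BITS, ADDR_BITS = 40, 64            # widths taken from slide 2
ALU_MASK = (1 << ALU_BITS) - 1
ADDR_MASK = (1 << ADDR_BITS) - 1
HIGH_MASK = ADDR_MASK & ~ALU_MASK       # the wires the ALU does not compute


def alu_add(a, b):
    """Hypothetical ALU: correct low 40 bits, upper wires are the OR of the inputs."""
    low = (a + b) & ALU_MASK
    high = (a | b) & HIGH_MASK
    return high | low


def bad_foo(sp):
    sp = alu_add(sp, -32 & ADDR_MASK)   # SP -= 32
    sp = alu_add(sp, 32)                # SP += 32: recomputed, may be an alias
    return sp


def good_foo(sp):
    gp = sp                             # GP = SP: a copy, no arithmetic
    sp = alu_add(sp, -32 & ADDR_MASK)   # SP -= 32
    sp = gp                             # SP = GP: the identical bit pattern
    return sp


caller_sp = 0x1000
print(hex(bad_foo(caller_sp)))
print(hex(good_foo(caller_sp)))
```

Under this model `bad_foo` hands back a stack pointer that is arithmetically equal to the caller's but differs in the upper wires, while `good_foo` hands back exactly the value it was given.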
  8. Regard machine code as compiled from Stack Machine control language
     ● Good code:
         cspt GP   # copy stack pointer to GP
         push 32   # make 32B space on stack
         …
         rspf GP   # restore stack pointer from GP
         return
  9. What makes that SM code safe?
     ● No access outside the current frame
       – The stack access commands are
         ● get 10 gp   # 10th stack cell contents..
         ● put 10 gp   # .. transfer to/from reg gp
       – If all access offsets in current frame range
         ● Only one way to access stack content..
         ● By offset from current stack pointer
       – Can only make new frame, not shift sp
         ● push 32
       – Can only return sp to value saved earlier
         ● cspt gp … rspf gp
  10. Heap access
      ● Deal with that later!
        – Look for array and string treatment in text
  11. Verifying SM code
      ● Means verifying that all stack accesses are within the current frame boundary
      ● That's so easy! Check n in 'get n r'.
      ● But we have machine code, not SM code!
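The check described on slide 11 could look roughly like this in Python. This is a sketch under assumptions: SM programs are represented as tuples such as `("get", 10, "gp")`, `check_sm` is a made-up name, offsets and frame sizes are compared as plain numbers (cell width is ignored), and only the frame-bound condition is checked, not the other conditions listed on slide 9.

```python
def check_sm(program):
    """Return a list of error strings; an empty list means every
    get/put offset lies inside the frame that is current at that point."""
    frame = 0          # size of the current frame; 0 means no frame yet
    errors = []
    for i, (op, *args) in enumerate(program):
        if op == "push":                 # push n: make a new frame of size n
            frame = args[0]
        elif op in ("get", "put"):       # get n r / put n r: access cell n
            n = args[0]
            if not 0 <= n < frame:
                errors.append(f"{i}: {op} {n} is outside the frame of size {frame}")
        elif op == "rspf":               # frame is gone once sp is restored
            frame = 0
    return errors


good = [("cspt", "gp"), ("push", 32), ("get", 10, "r1"),
        ("put", 12, "r1"), ("rspf", "gp"), ("return",)]
bad = [("cspt", "gp"), ("push", 32), ("get", 40, "r1"),
       ("rspf", "gp"), ("return",)]

print(check_sm(good))
print(check_sm(bad))
```

The register name `r1` is only a placeholder. The first program yields no errors, the second reports the access at offset 40.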
  12. Machine code looks like this
      ● mov gp sp        # cspt gp
        addi sp sp -32   # push 32
        …
        mov sp gp        # rspf gp
        jr ra            # return
      ● Is it compiled from safe SM code?
  13. To prove m/c safe
      ● Apply Hoare-like rules of reasoning
        – Whose names are the SM code that the m/c is supposed to be compiled from
      ● Requires human being to choose rule
        – Or an automaton to search solution space
        – Either way, it's deduction-guided disassembly
  14. Example
      ● Think about a 32B current frame
          { sp=c32 !10; (10)=x }
          ld gp 10(sp)   [get 10 gp]
          { sp=c32 !10; (10)=gp=x }
      ● 'c32 !10' means pointer to 32B
        – Already written at offset 10
      ● (10)=x means stack cell 10 has an x-thing
      ● Machine code is 'ld gp 10(sp)'
        – Load reg gp from offset 10 from stack ptr
      ● Name of the rule is 'get 10 gp'
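A rough Python sketch of what rule-guided disassembly might look like for the handful of instructions shown on slides 12 and 14. It is a heavy simplification and not the authors' logic: the function name `disassemble` is invented, the state tracks only the frame size (the `32` in `c32`) and the register holding the saved stack pointer, and the "already written at offset 10" part of the type is ignored. An instruction for which no rule applies is rejected.

```python
import re


def disassemble(lines):
    """Map each machine instruction to the SM rule it could be compiled
    from, or raise ValueError if no rule applies in the current state."""
    frame = None   # size of the current frame, i.e. sp has type c<frame>
    saved = None   # register currently holding the caller's sp
    sm = []
    for line in lines:
        op, *args = line.split()
        if op == "mov" and args[1] == "sp" and args[0] != "sp":
            saved = args[0]                       # mov r sp   => cspt r
            sm.append(f"cspt {saved}")
        elif op == "mov" and args[0] == "sp" and args[1] == saved:
            frame = None                          # mov sp r   => rspf r
            sm.append(f"rspf {args[1]}")
        elif op == "addi" and args[:2] == ["sp", "sp"] and int(args[2]) < 0:
            frame = -int(args[2])                 # addi sp sp -n => push n
            sm.append(f"push {frame}")
        elif op == "ld" and frame is not None:
            m = re.fullmatch(r"(\d+)\(sp\)", args[1])
            if not m or int(m.group(1)) >= frame:
                raise ValueError(f"access outside frame: {line}")
            if args[0] == saved:
                saved = None                      # saved sp was overwritten
            sm.append(f"get {m.group(1)} {args[0]}")   # ld r n(sp) => get n r
        elif op == "jr" and args == ["ra"]:
            sm.append("return")                   # jr ra      => return
        else:
            raise ValueError(f"no rule applies to: {line}")
    return sm


code = ["mov gp sp", "addi sp sp -32", "ld r1 10(sp)", "mov sp gp", "jr ra"]
print(disassemble(code))
```

The register `r1` is a placeholder. Changing the load to `ld r1 40(sp)` makes the sketch raise `ValueError`, because offset 40 lies outside the 32B frame.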
  15. Types
      ● Logic is based on stack machine model
        – manipulates types in register/stack/heap
      ● C32
        – pointer to stack frame of size 32
        – Only access by bounded offset from ptr
      ● U10
        – array of size 10 on heap
        – Can only access by offset from fixed base
      ● C1 - string accessed in increments of 1
        – String is like a stack of frames size 1
        – Stepping up `pops one off the stack'
        – Access within `current frame' only
  16. Typing
      ● Milner typing
        – Assign type variables to every register and stack position within current frame
        – Calculate effect of instructions
        – Ambiguous modulo assignment of rule
          ● Equals dis-assembly of instruction
      ● Proved – soundness
        – Assigned types say what really happens
  17. Other Proved Things
      ● Termination
        – Milner algorithm terminates
        – With a typing, if one exists, errors if not
      ● Uniqueness
        – The type found is unique most general
          ● For a given annotation
          ● There are at most 32 valid annotations
        – Differ in position of stack pointer register
  18. Conclusion
      1. Disassemble machine code
         • Human activity
      2. Apply Milner typing
         • Includes stack machine bounds verification
         • Automated activity
      3. Certify m/c as hardware alias safe
         ● Steps 1 & 2 can be mixed/simultaneous
         ● Inference-guided disassembly
      4. Apply to assembler in Linux kernel