IL2C: a translator from ECMA-335 CIL/MSIL to the C language.
How does IL2C work, and how does it aim for tiny resource requirements?
How does AOT (ahead-of-time) compilation work in IL2C?
What has the IL2C project done, what is it doing, and what will it do?
2. Kouji Matsui – kozy, kekyo
• NAGOYA city, AICHI pref., JP
• Twitter – @kozy_kekyo, @kekyo2 / Facebook
• Self-employed (I’m looking for a job)
• Microsoft Most Valuable Professional, VS and DevTech, 2015-
• Certified Scrum Master / Scrum Product Owner
• Center CLR organizer
• .NET/C#/F#/IL/metaprogramming and the like…
• Bike rider
3. Agenda
Abstract
What’s the IL2C
◦ Building scheme
Translation details
◦ The runtime types – primitive and string
◦ How the garbage collector works
◦ The value type / boxing
◦ The enum types
◦ The delegate types
◦ How exceptions work
◦ How virtual methods work (virtual, override and interface implementations)
4. Abstract
WARNING:
◦ This session contains very complex technology topics, rated LEVEL 600 or above.
◦ Are you ready? ;)
How does IL2C work, and how does it aim for tiny resource requirements?
How does AOT (ahead-of-time) compilation work in IL2C?
What has the IL2C project done, what is it doing, and what will it do?
6. What’s the IL2C
A translator from ECMA-335 CIL/MSIL to the C language.
[Diagram: C# code / F# code → IL2C → target native binary]
8. What’s the IL2C
IL2C's implementation priorities. We're aiming for:
◦ Better predictability of runtime costs, and better human readability of the translated C source code.
◦ A very small footprint; we are thinking about how to fit both tiny embedded systems and large systems with many resources.
◦ Better code/runtime portability; the minimum requirement is only a C99 compiler.
◦ Better interoperability with existing C libraries; we can use standard .NET interop techniques (such as P/Invoke).
◦ A seamless build system for major C toolchains, for example CMake, Arduino IDE, VC++ ...
9. What’s the IL2C
[Diagram: C# code → C# compiler (Roslyn), F# code → F# compiler (FCS), or another compiler → assembly (*.dll) → IL2C → C language (*.c, *.h) → target dev C language compiler (VS, Arduino, mbed, etc.) → target native binary]
10. What’s the IL2C
[Diagram: C# code → C# compiler (Roslyn), F# code → F# compiler (FCS) → assembly (*.dll) → IL2C → C language (*.c, *.h) → target dev C language compiler (VS, Arduino, mbed, etc.) → target native binary. The assembly is labeled “CIL / MSIL ECMA-335 specific binaries” and the *.c / *.h files are labeled “C language source code”.]
11. Building schemes (aiming for)
[Diagram: C# code → C# compiler (Roslyn) → assembly (*.dll) → IL2C → C language (*.c, *.h) → target dev C language compiler (VS, Arduino, mbed, etc.) → target native binary. Additional boxes: NuGet (*.dll), C language (*.c, *.h), IL2C runtime (*.c, *.h), prebuilt libraries, 3rd party libraries.]
13. The runtime types – primitive and string
typedef aliases
◦ stdint.h, stdbool.h, wchar_t, float.h
C# type   C type
byte      uint8_t
short     int16_t
int       int32_t
long      int64_t
sbyte     int8_t
ushort    uint16_t
uint      uint32_t
ulong     uint64_t
float     float
double    double
bool      bool
char      wchar_t
IntPtr    intptr_t
UIntPtr   uintptr_t
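The mapping above can be sketched as plain C99 typedefs. The alias names below are assumptions that follow the `System_Int32` style seen on later slides; the real IL2C headers may spell them differently. Note also that the width of `wchar_t` depends on the platform.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <wchar.h>

/* Hypothetical alias names; only System_Int32 appears in the slides. */
typedef uint8_t   System_Byte;
typedef int8_t    System_SByte;
typedef int16_t   System_Int16;
typedef uint16_t  System_UInt16;
typedef int32_t   System_Int32;
typedef uint32_t  System_UInt32;
typedef int64_t   System_Int64;
typedef uint64_t  System_UInt64;
typedef float     System_Single;
typedef double    System_Double;
typedef bool      System_Boolean;
typedef wchar_t   System_Char;    /* width is platform dependent */
typedef intptr_t  System_IntPtr;
typedef uintptr_t System_UIntPtr;

/* C# `int` arithmetic then becomes ordinary fixed-width C arithmetic. */
static System_Int32 add_int32(System_Int32 a, System_Int32 b)
{
    return a + b;
}
```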
14. The runtime types – primitive and string
System.String – variable storage space
[Diagram: System_String (heap) has two fields: vptr0__, pointing to System_String_VTable, and string_body__, with the characters “ABCDEFGHIJ0”]
15. The runtime types – primitive and string
Constant literal string
[Diagram: System_String (.rdata) has two fields: vptr0__, pointing to System_String_VTABLE, and string_body__, pointing to a const wchar_t[] (.rdata) holding “ABCDEFGHIJ0”]
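16. The runtime types – primitive and string
Constant literal string
[Diagram: an IL2C_REF_HEADER (.rdata) with pNext, type and gcMark fields precedes the System_String. The System_String has vptr0__, pointing to the VTABLE, and string_body__, pointing to a const wchar_t[] (.rdata) holding “ABCDEFGHIJ0”]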
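A minimal sketch of how such a constant literal could be laid out as static data, so that no heap allocation happens for it. The struct shapes follow the field names on the slides; the wrapper struct, the `GCMARK_CONST_SKETCH` value and `get_literal()` are hypothetical, and the trailing “0” on the slide is presumably the terminator that the C literal adds by itself.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

typedef struct IL2C_REF_HEADER {
    struct IL2C_REF_HEADER* pNext;
    const void* type;
    int gcMark;
} IL2C_REF_HEADER;

typedef struct System_String {
    const void* vptr0__;            /* would point to the string VTABLE */
    const wchar_t* string_body__;
} System_String;

/* Hypothetical mark value meaning "static data, never collect". */
enum { GCMARK_CONST_SKETCH = 2 };

/* The character data lives in read-only storage. */
static const wchar_t literal_body[] = L"ABCDEFGHIJ";

/* Header and string object are emitted together as constant data. */
static const struct {
    IL2C_REF_HEADER header;
    System_String body;
} literal = {
    { NULL, NULL, GCMARK_CONST_SKETCH },
    { NULL, literal_body }
};

static const System_String* get_literal(void)
{
    return &literal.body;
}
```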
18. How the garbage collector works
Basic strategy – mark and sweep algorithm (compaction NOT included)
1. Clear mark
2. Set mark (starting from the roots)
3. Free unmarked
19. How the garbage collector works: Phase 1
[Diagram: g_pBeginHeader points to a chain of IL2C_REF_HEADER (heap) entries linked through pNext. Each header has pNext, type and gcMark fields. Every gcMark is set to GCMARK_NOMARK.]
20. How the garbage collector works: Phase 2
[Diagram: starting from g_pBeginFrame, the headers of the objects that are reached (for example a System_String with vptr0__ and string_body__ “ABCDEFGHIJ0”) get gcMark = GCMARK_LIVE.]
21. How the garbage collector works: Phase 2
Step 2 details
[Diagram: g_pBeginFrame heads the frames of Function1(), Function2() and Function3(). Each frame lists its locals (local1, local2, local3). Some locals are null; an object referenced by a local is marked GCMARK_LIVE.]
22. How the garbage collector works: Phase 3
[Diagram: the chain from g_pBeginHeader now holds a mix of GCMARK_LIVE and GCMARK_NOMARK headers; the unmarked ones are freed.]
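The three phases can be sketched in a few lines of C. This is a simplified illustration, not the IL2C runtime: the `EXECUTION_FRAME` layout, `gc_alloc()` and `gc_collect()` are hypothetical names, and the mark phase here only marks objects referenced directly by frame locals (it does not trace references held inside objects).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum { GCMARK_NOMARK = 0, GCMARK_LIVE = 1 };

typedef struct IL2C_REF_HEADER {
    struct IL2C_REF_HEADER* pNext;
    const void* type;               /* unused in this sketch */
    int gcMark;
} IL2C_REF_HEADER;

/* Hypothetical frame layout: each frame lists the addresses of its
   objref locals, and frames are chained through pNext. */
typedef struct EXECUTION_FRAME {
    struct EXECUTION_FRAME* pNext;
    size_t count;
    void** targets[4];
} EXECUTION_FRAME;

static IL2C_REF_HEADER* g_pBeginHeader = NULL;
static EXECUTION_FRAME* g_pBeginFrame = NULL;

/* Allocate header + body and link the header into the global chain. */
static void* gc_alloc(size_t bodySize)
{
    IL2C_REF_HEADER* h = malloc(sizeof *h + bodySize);
    if (h == NULL) return NULL;
    h->pNext = g_pBeginHeader;
    h->type = NULL;
    h->gcMark = GCMARK_NOMARK;
    g_pBeginHeader = h;
    return h + 1;                   /* the body follows the header */
}

/* Returns the number of objects freed. */
static size_t gc_collect(void)
{
    size_t freed = 0;
    size_t i;
    IL2C_REF_HEADER* h;
    IL2C_REF_HEADER** pp;
    EXECUTION_FRAME* f;

    /* Phase 1: clear every mark. */
    for (h = g_pBeginHeader; h != NULL; h = h->pNext)
        h->gcMark = GCMARK_NOMARK;

    /* Phase 2: mark objects referenced by non-null frame locals. */
    for (f = g_pBeginFrame; f != NULL; f = f->pNext) {
        for (i = 0; i < f->count; i++) {
            void* ref = *f->targets[i];
            if (ref != NULL)
                ((IL2C_REF_HEADER*)ref - 1)->gcMark = GCMARK_LIVE;
        }
    }

    /* Phase 3: unlink and free everything still unmarked. */
    pp = &g_pBeginHeader;
    while (*pp != NULL) {
        h = *pp;
        if (h->gcMark == GCMARK_NOMARK) {
            *pp = h->pNext;
            free(h);
            freed++;
        } else {
            pp = &h->pNext;
        }
    }
    return freed;
}
```

Because there is no compaction, object addresses never change, which is what lets the translated C code hold raw pointers to managed objects.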
24. The value type / boxing
[Diagram: a boxed System_Int32 on the heap consists of IL2C_REF_HEADER (pNext, type, gcMark), then System_ValueType (vptr0__), then the System_Int32 value itself: 32 bits (4 bytes), bit-exact.]
Allocation size:
sizeof(IL2C_REF_HEADER) +
sizeof(System_ValueType) +
sizeof(System_Int32)
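The size formula above suggests a boxing routine along these lines. This is a sketch under assumptions: `box_int32()` and `unbox_int32()` are hypothetical helper names, and the header is not linked into any GC chain here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct IL2C_REF_HEADER {
    struct IL2C_REF_HEADER* pNext;
    const void* type;
    int gcMark;
} IL2C_REF_HEADER;

typedef struct System_ValueType {
    const void* vptr0__;
} System_ValueType;

typedef int32_t System_Int32;

/* Box: allocate header + System_ValueType + value, then copy the bits. */
static System_ValueType* box_int32(const System_Int32* value)
{
    size_t size = sizeof(IL2C_REF_HEADER) +
                  sizeof(System_ValueType) +
                  sizeof(System_Int32);
    IL2C_REF_HEADER* header = calloc(1, size);
    System_ValueType* boxed;
    if (header == NULL) return NULL;
    boxed = (System_ValueType*)(header + 1);
    memcpy(boxed + 1, value, sizeof *value);
    return boxed;
}

/* Unbox: only pointer arithmetic, the value is not copied. */
static System_Int32* unbox_int32(System_ValueType* boxed)
{
    return (System_Int32*)(boxed + 1);
}
```

Writing through the pointer returned by `unbox_int32()` changes the boxed copy, not the original local, which is the behavior the following questions explore.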
28. The value type / boxing
Q2: Valid?
Roslyn does not allow a value type method to be marked “virtual”.
Instead, we can force virtualization by assigning the value to an interface.
The value will be boxed for the call.
29. The value type / boxing
Q3: Valid?
Mutable usage via the virtual method
…Boxed?
30. The value type / boxing : Q1
[Diagram: 1. take a pointer (System_Int32*) to the System_Int32 value; 2. pass it as arg0 (this) to System_Int32_ToString(System_Int32* this). Copy-free invoking.]
31. The value type / boxing : Q3
[Diagram: a boxed System_Int32: IL2C_REF_HEADER (heap) with pNext, type and gcMark, then System_ValueType with vptr0__, then System_Int32. vptr0__ points to System_Int32_VTABLE (.rdata): offset__, System_Int32_Equals(…), System_Object_Finalize(…), System_Int32_GetHashCode(…), System_Int32_ToString(…). The ToString entry points to System_Int32_ToString(System_Int32* this).]
Where’s unboxing?
32. The value type / boxing : Q3
[Diagram: System_Int32_VTABLE (.rdata): offset__, System_Int32_Equals_VFunc(…), System_Object_Finalize(…), System_Int32_GetHashCode_VFunc(…), System_Int32_ToString_VFunc(…). System_Int32_ToString_VFunc(System_ValueType* this) unboxes (without copying) and calls System_Int32_ToString(System_Int32* this).]
We can (and have to) manipulate the instance fields.
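The `_VFunc` thunk idea can be sketched as follows, using `GetHashCode` instead of `ToString` so that no string runtime is needed. `BoxedInt32` is a hypothetical helper struct for viewing the boxed body; the function names follow the slide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t System_Int32;

typedef struct System_ValueType {
    const void* vptr0__;
} System_ValueType;

/* Hypothetical view of a boxed Int32 body: the System_ValueType part
   followed directly by the value. */
typedef struct BoxedInt32 {
    System_ValueType base;
    System_Int32 value;
} BoxedInt32;

/* The real method body works on an unboxed raw pointer. */
static int32_t System_Int32_GetHashCode(System_Int32* this__)
{
    return *this__;
}

/* Thunk stored in the VTABLE: it unboxes by adjusting the pointer
   (no copy), then forwards to the real method. */
static int32_t System_Int32_GetHashCode_VFunc(System_ValueType* this__)
{
    return System_Int32_GetHashCode(&((BoxedInt32*)this__)->value);
}
```

The same method body serves both the direct call on a local value (Q1) and the virtual call on a boxed instance (Q3); only the thunk differs.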
33. The value type / boxing : Q2
[Diagram: System_Int32_IFoo_VTABLE (.rdata): offset__, System_Int32_ToString_VFunc(…). System_Int32_ToString_VFunc(System_ValueType* this) unboxes (without copying) and calls System_Int32_ToString(System_Int32* this).]
We can manipulate the instance fields, but the changes are going to be discarded.
34. The value type / boxing
My strategy was:
◦ Q1: The value type methods can access their fields through an unboxed raw pointer (copy-free access).
◦ Q2: The interface implementation can use the same procedure.
→ We have to copy the instance…
◦ Q3: The value type virtual methods use the same procedure…
Will fix this problem…
36. The enum types
Q: How would you implement the enum types in the C language?
[Code figure: a C# (IL) enum translated to C]
37. The enum types
Q: How would you implement the enum types in the C language?
[Code figure: a C# (IL) enum translated to C]
It has 2 problems
38. The enum types
A1: Enum symbol names are global in C. We have to be able to give distinct names to each enum type…
[Code figure (C): the type name is the combined namespace; each value name is the combined namespace plus the value symbol]
39. The enum types
A2: In the C language, an enum type can’t specify its storage size.
[Code figure (C): an enum declared with an underlying type fails to compile]
consoleapplication1.c(3):
error C2059: syntax error: ':'
In C++, we can use this syntax.
40. The enum types
Final result, the IL2C way:
[Code figure (C)]
IL2C has to calculate the real value of each symbol…
41. The enum types
Final result, the IL2C way:
◦ Enum types are NOT derived from the System.Enum type.
◦ Enum types and integer types convert implicitly in both directions.
[Code figures (C#, IL): converting from Int32 to Int32EnumType implicitly]
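One way to satisfy both answers (unique names and an explicit storage size) is a typedef over the underlying integer type plus precomputed, prefixed constants. The slides' actual code figures are lost, so the sketch below is an assumption: the `Foo` namespace, the value names and the use of `#define` are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C# source:
     namespace Foo {
         enum Int32EnumType : int { AAA, BBB = 5, CCC }
         enum ByteEnumType : byte { XXX = 1 }
     }
*/

/* The storage size comes from the typedef, not from a C enum. */
typedef int32_t Foo_Int32EnumType;
#define Foo_Int32EnumType_AAA ((Foo_Int32EnumType)0)
#define Foo_Int32EnumType_BBB ((Foo_Int32EnumType)5)
#define Foo_Int32EnumType_CCC ((Foo_Int32EnumType)6)  /* computed by the translator */

typedef uint8_t Foo_ByteEnumType;
#define Foo_ByteEnumType_XXX ((Foo_ByteEnumType)1)

/* Enum and integer convert implicitly, as they do at the IL level. */
static Foo_Int32EnumType from_int32(int32_t value)
{
    return value;
}
```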
44. The delegate types
Delegate types almost always inherit from System.MulticastDelegate.
◦ The System.Delegate type has two members, “Method” and “Target.”
◦ The derived delegate type has a type-safe method named “Invoke.”
[Diagram: inheritance chain System.Object → System.Delegate (MethodInfo Method; object Target;) → System.MulticastDelegate → System.EventHandler (void Invoke(object sender, EventArgs e))]
45. The delegate types
The “Method” member holds the information about the callee.
◦ If the callee is a static method (not an instance method), “Target” is null.
[Diagram: System.Delegate: Method → the callee, static method private static void Form_Clicked(object sender, EventArgs e); Target → null (there is no target instance)]
46. The delegate types
◦ If the callee is an instance method (not a static method), “Target” refers to the instance.
[Diagram: System.Delegate: Method → the callee, instance method private void Form_Clicked(object sender, EventArgs e) of class Bar.Form; Target → the instance]
47. The delegate types
The IL2C way emits code with an instance detection expression.
[Code figures (C#, C): System_EventHandler_Invoke(…) detects whether “Target” is not NULL]
48. The delegate types
An instance method of a value type can be assigned to a delegate, and this is valid.
[Code figure (C#), referring to: public override string System.Int32.ToString()]
49. The delegate types
An instance method of a value type can be assigned to a delegate, and this is valid.
[Code figure (IL): boxing, then resolving the virtual method pointer]
50. The delegate types
The boxing opcode is very important in this case: the “Target” member is of type System.Object, so it can’t store a native value or a managed pointer.
[Diagram: System.Delegate: Method → instance method public override string ToString() of value type System.Int32; Target (System.Object) → the boxed value, behind an IL2C_REF_HEADER (heap)]
51. The delegate types
As illustrated, a System.MulticastDelegate can hold multiple delegates in one delegate. It is implemented as a list of delegates.
[Code figure (C#): it contains delegate[]]
52. The delegate types
As illustrated, a System.MulticastDelegate can hold multiple delegates in one delegate. It is implemented as a list of delegates.
[Diagram: MulticastDelegate.invocationList → delegate[] → elements [0], [1], [2], …, each pointing to a System.Delegate]
53. The delegate types
We know these instances all come from the heap…
[Diagram: the same structure, where the MulticastDelegate, the delegate[] and every System.Delegate each carry their own IL2C_REF_HEADER (heap)]
Too many allocations!!
54. The delegate types
In the IL2C way, all delegate types are aggregated into one implementation, “System_Delegate.” It contains the list of delegates, and it’s NOT an array. The most important point is that the delegate storage size is VARIABLE.
[Diagram: System_Delegate followed by a variable-size invocation table: [0] Target, [0] Method, [1] Target, [1] Method. Each Method points to private static void Form_Clicked(object sender, EventArgs e).]
56. The delegate types
The real declaration of IL2C’s System_Delegate:
[Code figure (C), annotated: target instance (or null), function pointer, invocation list]
57. The delegate types
Delegate.Combine() and Delegate.Remove() compute the invocation table and store it in a delegate instance that is allocated only once.
[Code figure (C) for Delegate.Combine(…): allocate once, then combine the invocation tables]
58. The delegate types
Overall, IL2C translates each delegate type into a specialized delegate invoker:
[Code figure (C): for every method in the invocation table, invoke through the function pointer]
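The variable-size delegate, the single-allocation combine and the specialized invoker can be sketched together. This is an illustration only: the real `System_Delegate` declaration is not reproduced in this transcript, so `Sketch_Delegate`, `INVOCATION_ENTRY`, `delegate_new()`, `delegate_combine()`, and the `IntHandler` delegate type are hypothetical. A C99 flexible array member provides the variable storage.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef void (*GENERIC_FUNC)(void);     /* type-erased function pointer */

typedef struct INVOCATION_ENTRY {
    void* target;                       /* instance, or NULL for static */
    GENERIC_FUNC methodPtr;
} INVOCATION_ENTRY;

typedef struct Sketch_Delegate {
    size_t count;
    INVOCATION_ENTRY table[];           /* variable-size invocation table */
} Sketch_Delegate;

static Sketch_Delegate* delegate_alloc(size_t count)
{
    Sketch_Delegate* d = malloc(sizeof *d + count * sizeof d->table[0]);
    if (d != NULL) d->count = count;
    return d;
}

static Sketch_Delegate* delegate_new(void* target, GENERIC_FUNC methodPtr)
{
    Sketch_Delegate* d = delegate_alloc(1);
    if (d != NULL) {
        d->table[0].target = target;
        d->table[0].methodPtr = methodPtr;
    }
    return d;
}

/* Combine: one allocation that holds both invocation tables. */
static Sketch_Delegate* delegate_combine(const Sketch_Delegate* a,
                                         const Sketch_Delegate* b)
{
    Sketch_Delegate* d = delegate_alloc(a->count + b->count);
    if (d == NULL) return NULL;
    memcpy(d->table, a->table, a->count * sizeof a->table[0]);
    memcpy(d->table + a->count, b->table, b->count * sizeof b->table[0]);
    return d;
}

/* Specialized invoker for a hypothetical `delegate void IntHandler(int)`. */
static void IntHandler_Invoke(const Sketch_Delegate* d, int32_t value)
{
    size_t i;
    for (i = 0; i < d->count; i++) {
        const INVOCATION_ENTRY* e = &d->table[i];
        if (e->target != NULL)          /* instance method: pass target */
            ((void (*)(void*, int32_t))e->methodPtr)(e->target, value);
        else                            /* static method */
            ((void (*)(int32_t))e->methodPtr)(value);
    }
}

/* Sample callees used to exercise the invoker. */
static int32_t g_staticTotal = 0;
static void AddStatic(int32_t value) { g_staticTotal += value; }

typedef struct Accumulator { int32_t total; } Accumulator;
static void Accumulator_Add(void* this__, int32_t value)
{
    ((Accumulator*)this__)->total += value;
}
```

Compared with a list of separately allocated delegate objects, a combined delegate here costs exactly one allocation regardless of how many callees it holds.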
64. How exceptions work
Important points about global unwinding:
◦ We have to find the matching exception filter (the “catch” type clause).
◦ If it is not found, keep looking for filters while crawling the stack frames from top to bottom.
Illustrated:
[Code figure (C) and diagram: a chain of stack frames for main() and Function1() through Function5(), running from the stack frame top (g_pBeginFrame) to the stack frame bottom (null)]
66. How exceptions work
What if the method contains multiple exception blocks?
[Code figure (C#), annotated: ignored the exception filter]
67. How exceptions work
Unwinding resembles crawling the stack frames, but exception handling has to work a bit differently:
◦ We have to use the “exception frame” instead of the stack frame.
◦ One is instantiated only for each exception block.
[Diagram: in RaiseNestedBlockLocal(), exception block 1 and exception block 2 have exception frame 1 and exception frame 2. g_pTopUnwindTarget heads the chain, which ends at null below main().]
68. How exceptions work
How do the exception frames work during a global unwind?
[Diagram: g_pTopUnwindTarget heads a chain of exception frames 1 and 2, for exception blocks 1 and 2, spread across RaiseNestedBlockGlobal() and RaiseNestedBlockCallee(); the chain ends at null below main().]
69. How exceptions work
This is the exception frame and its overall usage.
[Code figure (C) and diagram: g_pTopUnwindTarget heads exception frames 3, 2 and 1 across Function3(), Function2(), Function1() and main(), ending at null. One function is annotated: this function doesn’t contain any exception block.]
70. How exceptions work
IL2C handles an exception this way:
1. Raise the exception with a strongly-typed instance.
2. The runtime crawls the exception frames from the top exception frame to the bottom. The “g_pTopUnwindTarget” pointer refers to the top frame.
3. Call the exception filter function of each exception frame.
4. If the exception filter function recognizes the exception type, IL2C has found a catchable block. It transfers the execution context into that block (unwind done).
5. If no catchable block is found by the bottom frame, IL2C raises the “Unhandled exception” state.
71. How exceptions work
What’s the exception filter?
C
int16_t Function1_ExceptionFilter_0(
System_Exception* ex)
72. How exceptions work
What’s the exception filter?
[Code figure (C), annotated: the method name; the index number of the local exception frame; “catch (Exception)”; the result is the filter number (unique within each filter)]
73. How exceptions work
What if an exception block contains multiple catch blocks?
[Code figure (C#): multiple catch clauses in one exception block]
74. How exceptions work
What if an exception block contains multiple catch blocks?
[Code figure (C): checks the exception type in series; the result is the filter number]
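A filter function for a block with two catch clauses might look like the sketch below. The filter signature comes from the slide; everything else is assumed for illustration: the tiny `TYPE_DECL` runtime type chain, `is_instance_of()`, and the convention that 0 means "no match".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, minimal runtime type information. */
typedef struct TYPE_DECL {
    const char* pTypeName;
    const struct TYPE_DECL* baseType;
} TYPE_DECL;

typedef struct System_Exception {
    const TYPE_DECL* type;
} System_Exception;

static const TYPE_DECL Type_Exception =
    { "System.Exception", NULL };
static const TYPE_DECL Type_SystemException =
    { "System.SystemException", &Type_Exception };
static const TYPE_DECL Type_FormatException =
    { "System.FormatException", &Type_SystemException };
static const TYPE_DECL Type_InvalidOperationException =
    { "System.InvalidOperationException", &Type_SystemException };

/* Walk the base type chain, like an `isinst` check. */
static int is_instance_of(const System_Exception* ex, const TYPE_DECL* type)
{
    const TYPE_DECL* t;
    for (t = ex->type; t != NULL; t = t->baseType)
        if (t == type) return 1;
    return 0;
}

/* Filter for:
     try { ... }
     catch (FormatException) { ... }   // filter number 1
     catch (SystemException) { ... }   // filter number 2
   Returns 0 (assumed "no match" value) when nothing applies. */
static int16_t Function1_ExceptionFilter_0(System_Exception* ex)
{
    if (is_instance_of(ex, &Type_FormatException)) return 1;
    if (is_instance_of(ex, &Type_SystemException)) return 2;
    return 0;
}
```

The checks run in source order, so a more specific catch clause must come before a more general one, just as in C#.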
75. How exceptions work
The “filter number” identifies which branch to take in the method body. The skeleton of the exception block, illustrated:
[Code figure (C), annotated: use this filter function; the filter number]
77. How exceptions work
[Code figure (C), annotated: an exception frame is declared for each try block; the current execution context is saved; the setjmp() result branches to the try block or a catch block]
78. How exceptions work
How about the exception raising side?
[Code figure (C): uses the longjmp() function; the execution context unwinds to the saved one]
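The try side and the raising side can be put together in a small setjmp/longjmp sketch. Only `g_pTopUnwindTarget` and the filter signature come from the slides; the `EXCEPTION_FRAME` layout, `throw_exception()`, `g_pCaught`, and the `kind` field standing in for the exception's runtime type are assumptions. Finally blocks and rethrow are left out.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* `kind` is a stand-in for the exception's runtime type. */
typedef struct System_Exception { int32_t kind; } System_Exception;
typedef int16_t (*EXCEPTION_FILTER)(System_Exception* ex);

/* Hypothetical frame layout: one frame per try block. */
typedef struct EXCEPTION_FRAME {
    struct EXCEPTION_FRAME* pNext;
    EXCEPTION_FILTER filter;
    jmp_buf saved;                      /* saved execution context */
} EXCEPTION_FRAME;

static EXCEPTION_FRAME* g_pTopUnwindTarget = NULL;
static System_Exception* g_pCaught = NULL;

/* Raising side: crawl the frames, ask each filter, longjmp on a match. */
static void throw_exception(System_Exception* ex)
{
    EXCEPTION_FRAME* f;
    for (f = g_pTopUnwindTarget; f != NULL; f = f->pNext) {
        int16_t filterNumber = f->filter(ex);
        if (filterNumber != 0) {
            g_pCaught = ex;
            g_pTopUnwindTarget = f->pNext;   /* unlink unwound frames */
            longjmp(f->saved, filterNumber);
        }
    }
    abort();                            /* unhandled exception */
}

/* Filter for the try block in Caller: `catch` matches kind 1 only. */
static int16_t Caller_ExceptionFilter_0(System_Exception* ex)
{
    return (ex->kind == 1) ? 1 : 0;
}

static void Callee(int32_t kind)
{
    static System_Exception ex;
    ex.kind = kind;
    throw_exception(&ex);
}

static int32_t Caller(int32_t kind)
{
    EXCEPTION_FRAME frame;
    frame.pNext = g_pTopUnwindTarget;
    frame.filter = Caller_ExceptionFilter_0;
    g_pTopUnwindTarget = &frame;

    switch (setjmp(frame.saved)) {
    case 0:                             /* try block */
        if (kind != 0) Callee(kind);
        g_pTopUnwindTarget = frame.pNext;    /* normal leave */
        return 0;
    case 1:                             /* catch block, filter number 1 */
        return 100 + g_pCaught->kind;
    default:
        return -1;
    }
}
```

The caught exception is kept in a global rather than in the frame because C leaves non-volatile locals modified between `setjmp()` and `longjmp()` indeterminate.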
79. How exceptions work
There are a lot… a lot of topics in exception handling that this session did not cover:
◦ How does the “rethrow” feature work?
◦ How does a nested rethrow work?
◦ How does a nested local exception block work?
◦ How does the finally block work?
◦ How do filter and fault blocks work?
◦ The setjmp and longjmp way (named sjlj) is slow. Can we improve it in another way?
◦ How do asynchronous exceptions work? (NullReferenceException, ArithmeticException, etc.)
81. How virtual methods work
The runtime type information:
[Diagram: IL2C_REF_HEADER (heap) with pNext, type and gcMark precedes System_String (vptr0__, string_body__ “ABCDEFGHIJ0”). The type field points to an IL2C_RUNTIME_TYPE_DECL (.rdata) with the fields pTypeName (“System.String”, a UTF8 string), flags, bodySize, baseType (the IL2C_RUNTIME_TYPE_DECL of [System.Object]), vptr0 (System_String_VTABLE__, copied to the instance’s vptr0__), markTarget, interfaceCount, …]
82. How virtual methods work
[Code figure (C): the flags include IL2C_TYPE_REFERENCE (objref), IL2C_TYPE_VALUE (value type), IL2C_TYPE_VARIABLE (variable storage size), …]
In the C language, a sizeof() expression is not valid for a type with empty storage space, so for such a type this field holds a size of 0.
83. How virtual methods work
How the virtual method table (VTABLE) works:
[Diagram: IL2C_RUNTIME_TYPE_DECL.vptr0 and System_String.vptr0__ point to System_String_VTABLE (.rdata): offset__ (0), System_String_Equals(…), System_Object_Finalize(…), System_String_GetHashCode(…), System_String_ToString(…). The GetHashCode entry points to System_String_GetHashCode(System_String* this).]
85. How virtual methods work
How does invoking a virtual method work?
[Code figure (C) and diagram: vptr0__ → VirtualBaseType_VTABLE__ (offset__ = 0, GetStringFromInt32(…), …) → VirtualBaseType_GetStringFromInt32(VirtualBaseType* this, int32_t value)]
86. How virtual methods work
How does invoking a virtual method work?
[Code figure (C) and diagram: VirtualBaseType_VTABLE__ (offset__ = 0, GetStringFromInt32(…), …) → VirtualBaseType_GetStringFromInt32(VirtualBaseType* this, int32_t value)]
Subtract the offset from the this pointer:
arg0 = this - offset__
But for a class VTABLE the “offset__” field is always 0…
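The call sequence above can be sketched as follows. The VTABLE and method names follow the slide, but several details are assumptions: the method returns a `const char*` instead of a managed string, `DerivedType` and `callvirt_GetStringFromInt32()` are hypothetical, and the VTABLE entries take `void*` so that overrides fit without casts.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct VirtualBaseType_VTABLE {
    intptr_t offset__;                  /* always 0 for a class VTABLE */
    const char* (*GetStringFromInt32)(void* this__, int32_t value);
} VirtualBaseType_VTABLE;

typedef struct VirtualBaseType {
    const VirtualBaseType_VTABLE* vptr0__;
} VirtualBaseType;

static const char* VirtualBaseType_GetStringFromInt32(void* this__, int32_t value)
{
    (void)this__;
    return (value == 0) ? "base:zero" : "base:nonzero";
}

/* Hypothetical override in a derived type. */
static const char* DerivedType_GetStringFromInt32(void* this__, int32_t value)
{
    (void)this__;
    (void)value;
    return "derived";
}

static const VirtualBaseType_VTABLE VirtualBaseType_VTABLE__ =
    { 0, VirtualBaseType_GetStringFromInt32 };
static const VirtualBaseType_VTABLE DerivedType_VTABLE__ =
    { 0, DerivedType_GetStringFromInt32 };

/* callvirt: load the VTABLE, adjust this by offset__, call the slot. */
static const char* callvirt_GetStringFromInt32(VirtualBaseType* obj, int32_t value)
{
    const VirtualBaseType_VTABLE* vt = obj->vptr0__;
    void* arg0 = (char*)obj - vt->offset__;
    return vt->GetStringFromInt32(arg0, value);
}
```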
87. How virtual methods work
How the interface virtual method table works:
[Code figure (C#): a type that implements an interface]
88. How virtual methods work
How the interface virtual method table works:
[Diagram: a VirtualNewAndImplementType instance has vptr0__ at +0 (→ VirtualNewAndImplementType_VTABLE, offset__ = 0) and vptr_IInterfaceType1 at +4 (→ VirtualNewAndImplementType_IInterfaceType1_VTABLE, offset__ = 4). TYPE [VirtualNewAndImplementType] holds interfaceType (→ TYPE [IInterfaceType1], which declares void GetStringFromInt32 ()) and vptrInterface. The interface VTABLE entry points to VirtualNewAndImplementType_GetStringFromInt32(VirtualNewAndImplementType* this).]
The interface VTABLE contains the this pointer’s offset. Subtract the offset to adjust the pointer:
arg0 = this - offset__
89. How virtual methods work
How does invoking an interface virtual method work?
[Code figure (C) and diagram: VirtualNewImplements_IInterfaceType1_VTABLE (offset__ = 4, GetStringFromInt32(…), …) → VirtualNewAndImplementType_GetStringFromInt32(VirtualNewAndImplementType* this)]
Exactly the same calculation is used for both the class VTABLE and the interface VTABLE:
arg0 = this - offset__
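The pointer adjustment for interface calls can be sketched like this. Type and VTABLE names follow the slides; the rest is assumed for illustration: the `field` member, the `int32_t` return type (chosen so the adjusted pointer can be observed), and the helper functions. The slide's offset of 4 corresponds to 32-bit pointers; `offsetof` yields the right value on any target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct IInterfaceType1_VTABLE {
    intptr_t offset__;                  /* offset of the interface vptr */
    int32_t (*GetStringFromInt32)(void* this__);
} IInterfaceType1_VTABLE;

/* An interface reference points at the interface vptr field itself. */
typedef struct IInterfaceType1 {
    const IInterfaceType1_VTABLE* vptr0__;
} IInterfaceType1;

typedef struct VirtualNewAndImplementType {
    const void* vptr0__;                                /* class VTABLE */
    const IInterfaceType1_VTABLE* vptr_IInterfaceType1; /* interface VTABLE */
    int32_t field;                                      /* hypothetical */
} VirtualNewAndImplementType;

static int32_t VirtualNewAndImplementType_GetStringFromInt32(void* this__)
{
    return ((VirtualNewAndImplementType*)this__)->field;
}

static const IInterfaceType1_VTABLE
VirtualNewAndImplementType_IInterfaceType1_VTABLE = {
    (intptr_t)offsetof(VirtualNewAndImplementType, vptr_IInterfaceType1),
    VirtualNewAndImplementType_GetStringFromInt32
};

/* Cast to the interface: take the address of the interface vptr. */
static IInterfaceType1* cast_to_IInterfaceType1(VirtualNewAndImplementType* obj)
{
    return (IInterfaceType1*)&obj->vptr_IInterfaceType1;
}

/* callvirt through the interface: arg0 = this - offset__. */
static int32_t callvirt_IInterfaceType1_GetStringFromInt32(IInterfaceType1* iref)
{
    const IInterfaceType1_VTABLE* vt = iref->vptr0__;
    void* arg0 = (char*)iref - vt->offset__;
    return vt->GetStringFromInt32(arg0);
}
```

Because the adjustment is driven entirely by `offset__`, the caller does not need to know the concrete type behind the interface reference.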
92. How virtual methods work
[Code figure (C#)]
1. Find the interface, starting from the most derived type.
2. List the methods with the same signature, starting from the type where it was found.
3. Find the most overridden method that is NOT newslot.