Non-Blocking Strategies for FFI
Don’t Block me Now!
Pablo Tesone
Pharo Consortium Engineer
1
Guille Polito
CNRS UMR9189

CRIStAL, Inria

RMoD
Pablo Tesone
Pharo Consortium
Engineer
Guille Polito
CNRS Engineer

RMod Team
• 10 years of experience in industrial
applications

• PhD in Dynamic Software Update

• Interested in improving
development tools and the daily
development process. 

• Enthusiast of the object oriented
programming and their tools.
• Experience industrial on service-
oriented and mobile applications.

• PhD in Computer Science

• Main research interests are
modularity and development tools.

• In the Pharo community since 2010

• More noticeable contributions:
Pharo Bootstrap process and
Iceberg.
FFI? Foreign Function Interface
Image
VM
External Libraries
FFI
We can communicate with anything that
has a C API
Operating System API
!3
Unified FFI in a nutshell
UFFI handles:
- Look-up of functions
- Marshalling of arguments
- Execution
- Marshalling of the return values
Concurrency in Pharo
!5
P1 P2 P3
Interpreter
Thread
p1
p3
p1
p2
p2
VM Thread
Concurrency in Pharo
!6
P1 P2 P3
Interpreter
Thread
p1
p3
p1
p2
p2
VM
handles
process
scheduling
VM Thread
Concurrency in Pharo
!7
P1 P2 P3
Interpreter
Thread
p1
p3
p1
p2
p2
p1
int function(char* foo, int bar)
VM Thread
Concurrency in Pharo
!8
P1 P2 P3
Interpreter
Thread
p1
p3
p1
p2
p2
p1
Out
of
VM
VM
loses
control
int function(char* foo, int bar)
VM Thread
Conceptual Non-Blocking FFI
!9
P1 P2 P3
Interpreter
Thread
p1
p3
p1
p2
p2
p1
p3
p2
?
int function(char* foo, int bar)
VM Thread
Strategy #1: Thread per Call-out
!10
Interpreter
Thread
p1
p2
p1
p2
Thread 1
<<Spawn>>
<<Signal Semaphore>>
int function(char* foo, int bar)
P1 P2 VM Thread
Strategy #1: Thread per Call-out
!11
Interpreter
Thread
p1
p2
p1
p2
Thread 1
<<Spawn>>
<<Signal Semaphore>>
int function(char* foo, int bar)
P1 P2
• Simple
• Expensive to spawn threads
• Calls are not in the same thread
• Cannot reuse existing threads (e.g., UI threads)
Not all libraries are
designed equally
• Different requirements

• Must run in the main thread (Cocoa)

• Must run in a single thread (Gtk+3)

• Runs on any thread but not concurrent (libgit, sqlite)

• Is a Thread-safe Library

• ….
!12
We need different
Strategies to choose from
!13
We need to choose different
strategies for each library
Strategy #2: Worker Threads
!14
Interpreter
Thread
p1
p2
p1
p2
Worker #1
<<Enqueue>>
Call out Queue
of Worker #1
<<Signal Semaphore>>
int function(char* foo, int bar)
P1 P2 VM Thread
Strategy #2: Worker Threads
!15
Interpreter
Thread
p1
p2
p1
p2
Worker #1
<<Enqueue>>
Call out Queue
of Worker #1
<<Signal Semaphore>>
int function(char* foo, int bar)
P1 P2
• Simple
• Group related calls
• No thread spawn overhead
• Expensive Callouts (synchronising queue)
• Do not support main thread!
Strategy #3: VM Thread Runner
!16
P1 P2 P3
Interpreter
Thread
p1
p3
p1
p2
p2
int function(char* foo, int bar)
p1
VM Thread
Strategy #3: VM Thread Runner
!17
P1 P2 P3
Interpreter
Thread
p1
p3
p1
p2
p2
int function(char* foo, int bar)
p1
• Simpler
• Group related calls
• No thread spawn overhead
• Backward compatibility
• Blocking
Strategy #4: Main Thread Runner
Interpreter
Thread
p1
p2
p1
p2
Worker #1
<<Enqueue>>
Call out Queue
of Worker #1
<<Signal Semaphore>>
int function(char* foo, int bar)
P1 P2 VM Thread
Strategy #4: Main Thread Runner
Interpreter
Thread
p1
p2
p1
p2
Worker #1
<<Enqueue>>
Call out Queue
of Worker #1
<<Signal Semaphore>>
int function(char* foo, int bar)
P1 P2
• Simple
• Group related calls
• No thread spawn overhead
• Supports main thread
• Expensive Callouts (synchronising queue)
• VM should be run in separate thread
Strategy #5: Global Interpreter Lock
int function(char* foo, int bar)
P1 P2
Interpreter
#1
Interpreter
#2
!20
VM Thread #1 VM Thread #2
Strategy #5: Global Interpreter Lock
int function(char* foo, int bar)
P1 P2
Interpreter
#1
Interpreter
#2
!21
• Group related calls
• No thread spawn overhead
• No Backward compatibility
• Application should be written with threading
in mind
• Requires VM modification
Strategy #6: Thread-safe interpreters
int function(char* foo, int bar)
!22
P1 P2
Interpreter
#1
Interpreter
#2
VM Thread #1 VM Thread #2
Strategy #6: Thread-safe interpreters
int function(char* foo, int bar)
!23
P1 P2
Interpreter
#1
Interpreter
#2
• Does not exist for Pharo
• Requires extensive modification of VM,
Plugins and Image core libraries
• Application should be written with threading
in mind
• Real multithreading not only for FFI
Implementations
!24
Queue Based FFI
(Pharo Threaded FFI Plugin)
GILda VM
(Global Interpreter Lock VM)
Strategy #1: Thread per Call-out
Strategy #2: Worker Threads
Strategy #3: VM Thread Runner
Strategy #5: Global Interpreter Lock
Strategy #4: Main Thread Runner
Future???
Strategy #6: Thread-safe
interpreters
Implementations
!25
Queue Based FFI
(Pharo Threaded FFI Plugin)
GILda VM
(Global Interpreter Lock VM)
Strategy #1: Thread per Call-out
Strategy #2: Worker Threads
Strategy #3: VM Thread Runner
Strategy #5: Global Interpreter Lock
Strategy #4: Main Thread Runner
Future???
Strategy #6: Thread-safe
interpreters
Transparency through
UFFI
!26
Your Program
Unified FFI
Pharo Threaded FFI
Pharo Threaded FFI Plugin
S#1 S#2 S#3 S#4
Squeak FFI
Squeak FFI
Plugin
Library
Decision Table
!27
Same Thread Worker Threads
Long Calls
Blocking Parallel
Short Calls
Little Overhead Lots of Overhead
OPTIMIZATIONS
Implementations
!28
GILda VM
(Global Interpreter Lock VM)
Strategy #5: Global Interpreter Lock
Queue Based FFI
(Pharo Threaded FFI Plugin)
Strategy #1: Thread per Call-out
Strategy #2: Worker Threads
Strategy #3: VM Thread Runner
Strategy #4: Main Thread Runner
Future???
Strategy #6: Thread-safe
interpreters
More Details
IWST - Virtual Machines Session
Thursday 11:00 am
Start Using It!
!29
Metacello new
baseline: ‘ThreadedFFI'
repository:
‘github://pharo-project/threadedFFI-Plugin/src';
load
pharo-project/threadedFFI-Plugin
Only in Pharo 8 + Headless VM
Conclusion
• Beta Version (In usage for Gtk+)

• Transparent for the user

• Strategies selected in the Image

• Uses LibFFI

• Re-entrant Callback support

• Tests!
!30
Preliminar results
• All marshalling image side + Lib FFI

• Short call

• Same thread 27 us

• Single worker thread 6791 us

• 2 Parallel long Calls (1 second per call)

• Same Thread 2001.9 ms

• Different working threads 1006.4ms

Non-Blocking Strategies for FFI

  • 1.
    Non-Blocking Strategies forFFI Don’t Block me Now! Pablo Tesone Pharo Consortium Engineer 1 Guille Polito CNRS UMR9189 CRIStAL, Inria RMoD
  • 2.
    Pablo Tesone Pharo Consortium Engineer GuillePolito CNRS Engineer RMod Team • 10 years of experience in industrial applications • PhD in Dynamic Software Update • Interested in improving development tools and the daily development process. • Enthusiast of the object oriented programming and their tools. • Experience industrial on service- oriented and mobile applications. • PhD in Computer Science • Main research interests are modularity and development tools. • In the Pharo community since 2010 • More noticeable contributions: Pharo Bootstrap process and Iceberg.
  • 3.
    FFI? Foreign FunctionInterface Image VM External Libraries FFI We can communicate with anything that has a C API Operating System API !3
  • 4.
    Unified FFI ina nutshell UFFI handles: - Look-up of functions - Marshalling of arguments - Execution - Marshalling of the return values
  • 5.
    Concurrency in Pharo !5 P1P2 P3 Interpreter Thread p1 p3 p1 p2 p2 VM Thread
  • 6.
    Concurrency in Pharo !6 P1P2 P3 Interpreter Thread p1 p3 p1 p2 p2 VM handles process scheduling VM Thread
  • 7.
    Concurrency in Pharo !7 P1P2 P3 Interpreter Thread p1 p3 p1 p2 p2 p1 int function(char* foo, int bar) VM Thread
  • 8.
    Concurrency in Pharo !8 P1P2 P3 Interpreter Thread p1 p3 p1 p2 p2 p1 Out of VM VM loses control int function(char* foo, int bar) VM Thread
  • 9.
    Conceptual Non-Blocking FFI !9 P1P2 P3 Interpreter Thread p1 p3 p1 p2 p2 p1 p3 p2 ? int function(char* foo, int bar) VM Thread
  • 10.
    Strategy #1: Threadper Call-out !10 Interpreter Thread p1 p2 p1 p2 Thread 1 <<Spawn>> <<Signal Semaphore>> int function(char* foo, int bar) P1 P2 VM Thread
  • 11.
    Strategy #1: Threadper Call-out !11 Interpreter Thread p1 p2 p1 p2 Thread 1 <<Spawn>> <<Signal Semaphore>> int function(char* foo, int bar) P1 P2 • Simple • Expensive to spawn threads • Calls are not in the same thread • Cannot reuse existing threads (e.g., UI threads)
  • 12.
    Not all librariesare designed equally • Different requirements • Must run in the main thread (Cocoa) • Must run in a single thread (Gtk+3) • Runs on any thread but not concurrent (libgit, sqlite) • Is a Thread-safe Library • …. !12
  • 13.
    We need different Strategiesto choose from !13 We need to choose different strategies for each library
  • 14.
    Strategy #2: WorkerThreads !14 Interpreter Thread p1 p2 p1 p2 Worker #1 <<Enqueue>> Call out Queue of Worker #1 <<Signal Semaphore>> int function(char* foo, int bar) P1 P2 VM Thread
  • 15.
    Strategy #2: WorkerThreads !15 Interpreter Thread p1 p2 p1 p2 Worker #1 <<Enqueue>> Call out Queue of Worker #1 <<Signal Semaphore>> int function(char* foo, int bar) P1 P2 • Simple • Group related calls • No thread spawn overhead • Expensive Callouts (synchronising queue) • Do not support main thread!
  • 16.
    Strategy #3: VMThread Runner !16 P1 P2 P3 Interpreter Thread p1 p3 p1 p2 p2 int function(char* foo, int bar) p1 VM Thread
  • 17.
    Strategy #3: VMThread Runner !17 P1 P2 P3 Interpreter Thread p1 p3 p1 p2 p2 int function(char* foo, int bar) p1 • Simpler • Group related calls • No thread spawn overhead • Backward compatibility • Blocking
  • 18.
    Strategy #4: MainThread Runner Interpreter Thread p1 p2 p1 p2 Worker #1 <<Enqueue>> Call out Queue of Worker #1 <<Signal Semaphore>> int function(char* foo, int bar) P1 P2 VM Thread
  • 19.
    Strategy #4: MainThread Runner Interpreter Thread p1 p2 p1 p2 Worker #1 <<Enqueue>> Call out Queue of Worker #1 <<Signal Semaphore>> int function(char* foo, int bar) P1 P2 • Simple • Group related calls • No thread spawn overhead • Supports main thread • Expensive Callouts (synchronising queue) • VM should be run in separate thread
  • 20.
    Strategy #5: GlobalInterpreter Lock int function(char* foo, int bar) P1 P2 Interpreter #1 Interpreter #2 !20 VM Thread #1 VM Thread #2
  • 21.
    Strategy #5: GlobalInterpreter Lock int function(char* foo, int bar) P1 P2 Interpreter #1 Interpreter #2 !21 • Group related calls • No thread spawn overhead • No Backward compatibility • Application should be written with threading in mind • Requires VM modification
  • 22.
    Strategy #6: Thread-safeinterpreters int function(char* foo, int bar) !22 P1 P2 Interpreter #1 Interpreter #2 VM Thread #1 VM Thread #2
  • 23.
    Strategy #6: Thread-safeinterpreters int function(char* foo, int bar) !23 P1 P2 Interpreter #1 Interpreter #2 • Does not exist for Pharo • Requires extensive modification of VM, Plugins and Image core libraries • Application should be written with threading in mind • Real multithreading not only for FFI
  • 24.
    Implementations !24 Queue Based FFI (PharoThreaded FFI Plugin) GILda VM (Global Interpreter Lock VM) Strategy #1: Thread per Call-out Strategy #2: Worker Threads Strategy #3: VM Thread Runner Strategy #5: Global Interpreter Lock Strategy #4: Main Thread Runner Future??? Strategy #6: Thread-safe interpreters
  • 25.
    Implementations !25 Queue Based FFI (PharoThreaded FFI Plugin) GILda VM (Global Interpreter Lock VM) Strategy #1: Thread per Call-out Strategy #2: Worker Threads Strategy #3: VM Thread Runner Strategy #5: Global Interpreter Lock Strategy #4: Main Thread Runner Future??? Strategy #6: Thread-safe interpreters
  • 26.
    Transparency through UFFI !26 Your Program UnifiedFFI Pharo Threaded FFI Pharo Threaded FFI Plugin S#1 S#2 S#3 S#4 Squeak FFI Squeak FFI Plugin Library
  • 27.
    Decision Table !27 Same ThreadWorker Threads Long Calls Blocking Parallel Short Calls Little Overhead Lots of Overhead OPTIMIZATIONS
  • 28.
    Implementations !28 GILda VM (Global InterpreterLock VM) Strategy #5: Global Interpreter Lock Queue Based FFI (Pharo Threaded FFI Plugin) Strategy #1: Thread per Call-out Strategy #2: Worker Threads Strategy #3: VM Thread Runner Strategy #4: Main Thread Runner Future??? Strategy #6: Thread-safe interpreters More Details IWST - Virtual Machines Session Thursday 11:00 am
  • 29.
    Start Using It! !29 Metacellonew baseline: ‘ThreadedFFI' repository: ‘github://pharo-project/threadedFFI-Plugin/src'; load pharo-project/threadedFFI-Plugin Only in Pharo 8 + Headless VM
  • 30.
    Conclusion • Beta Version(In usage for Gtk+) • Transparent for the user • Strategies selected in the Image • Uses LibFFI • Re-entrant Callback support • Tests! !30
  • 31.
    Preliminar results • Allmarshalling image side + Lib FFI • Short call • Same thread 27 us • Single worker thread 6791 us • 2 Parallel long Calls (1 second per call) • Same Thread 2001.9 ms • Different working threads 1006.4ms