An insane idea on reference counting
Gleb Smirnoff
glebius@FreeBSD.org
BSDCan 2014
Ottawa
May 16, 2014
Gleb Smirnoff glebius@FreeBSD.org An insane idea on reference counting May 16, 2014 1 / 28
Introduction
The configuration problem
Problem:
A structure in memory describing a “configuration”
Multiple readers at high rate
Sporadic writers
Examples:
local IP addresses hash
interfaces list, particular ifnet
firewall rules
Introduction
Ready to use solutions
What we use in FreeBSD:
rwlock(9)
Acquiring thread == Releasing thread
Very expensive: all readers do atomic(9) on the same word
rmlock(9)
Acquiring thread == Releasing thread
Does sched_pin(9) for the entire operation
refcount(9)
Acquiring thread != Releasing thread
Expensive: every acquire and release is an atomic(9) operation
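For comparison, here is a minimal userland sketch of the refcount(9) pattern. It uses C11 `<stdatomic.h>` instead of the kernel's atomic(9), and the names `obj_ref`, `obj_init`, `obj_hold` and `obj_rele` are hypothetical, not the kernel API. It only illustrates why the acquiring and releasing threads may differ and where the cost comes from.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical object with an embedded reference count. */
struct obj_ref {
	atomic_uint refs;
};

static void
obj_init(struct obj_ref *o)
{
	atomic_init(&o->refs, 1);	/* the creator holds the first reference */
}

/* Any thread may take a reference: one atomic read-modify-write. */
static void
obj_hold(struct obj_ref *o)
{
	unsigned int old = atomic_fetch_add(&o->refs, 1);

	assert(old > 0);
}

/* Any thread may drop it; returns true when the last reference is gone. */
static bool
obj_rele(struct obj_ref *o)
{
	unsigned int old = atomic_fetch_sub(&o->refs, 1);

	assert(old > 0);
	return (old == 1);
}
```

Every hold and release is a read-modify-write on one shared word, so all CPUs contend for the same cache line.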
Introduction
Patented solution
RCU
Acquiring thread == Releasing thread
Patented :(
Introduction
Goals
Ultra lightweight for a reader
Acquiring thread != Releasing thread
Sounds like refcount(9) w/o atomics.
counter(9) based refcounter API
counter(9) as refcount?
counter(9): a new facility in FreeBSD 10
counter_u64_t cnt;
cnt = counter_u64_alloc(M_WAITOK);
counter_u64_add(cnt, 1);
Lightweight, thanks to per-CPU memory
counter_u64_add() is a single instruction on amd64
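The per-CPU idea can be modelled in userland as an array with one slot per CPU. This is a sketch under stated assumptions, not the counter(9) implementation: `NCPU`, `pcpu_counter`, `pcpu_counter_add` and `pcpu_counter_fetch` are hypothetical names, and the CPU index is passed explicitly instead of coming from the hardware.

```c
#include <assert.h>
#include <stdint.h>

#define NCPU 4	/* hypothetical CPU count for the sketch */

/* One slot per CPU; in the kernel each slot lives in that CPU's own memory. */
struct pcpu_counter {
	int64_t slot[NCPU];
};

/* An update touches only the caller's slot: no atomic op, no shared word. */
static void
pcpu_counter_add(struct pcpu_counter *c, int cpu, int64_t inc)
{
	assert(cpu >= 0 && cpu < NCPU);
	c->slot[cpu] += inc;
}

/* Reading sums every slot; an individual slot may be negative. */
static int64_t
pcpu_counter_fetch(const struct pcpu_counter *c)
{
	int64_t sum = 0;

	for (int i = 0; i < NCPU; i++)
		sum += c->slot[i];
	return (sum);
}
```

Because only the sum matters, an increment on one CPU can be balanced by a decrement on another, which is what allows the acquiring and releasing threads to differ.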
counter(9) based refcounter API
Suggested API
struct lwref {
	void		*ptr;
	counter_u64_t	 cnt;
};
typedef struct lwref *lwref_t;
counter(9) based refcounter API
Suggested API
void *lwref_acquire(lwref_t lwr, counter_u64_t *c);
Returns the “configuration” pointer from lwr
Increments the counter(9) in lwr
Returns that counter(9) through c, so the reference can be released later
counter(9) based refcounter API
Suggested API
void lwref_change(lwref_t lwr, void *newptr,
    void (*freefn)(void *, void *), void *freearg);
Changes the “configuration” pointer in lwr to newptr
Allocates a new counter(9) for the lwr
Asynchronously frees the old pointer and old counter(9) when it is safe
counter(9) based refcounter API
Suggested API
lwref_acquire must be safe against lwref_change
lwref_acquire must not be expensive
lwref_change is allowed to be expensive
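The intended life cycle can be sketched as a single-threaded userland model. Everything prefixed `sim_` is hypothetical and only mirrors the shape of the suggested API. The release helper and the "drained" check are assumptions about how a caller would use the returned counter, since the slides do not show them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_NCPU 2	/* hypothetical CPU count */

/* Stand-in for counter(9): one slot per simulated CPU. */
struct sim_counter {
	int64_t slot[SIM_NCPU];
};

/* Mirrors struct lwref from the slides. */
struct sim_lwref {
	void *ptr;
	struct sim_counter *cnt;
};

/* Reader: hand the counter back, bump it, return the pointer. */
static void *
sim_lwref_acquire(struct sim_lwref *lwr, int cpu, struct sim_counter **cp)
{
	*cp = lwr->cnt;
	(*cp)->slot[cpu]++;
	return (lwr->ptr);
}

/* Release uses the counter the reader was given, on whatever CPU runs it. */
static void
sim_lwref_release(struct sim_counter *c, int cpu)
{
	c->slot[cpu]--;
}

/* Writer: install a new pointer and a fresh counter; return the old counter. */
static struct sim_counter *
sim_lwref_change(struct sim_lwref *lwr, void *newptr,
    struct sim_counter *newcnt)
{
	struct sim_counter *old = lwr->cnt;

	lwr->ptr = newptr;
	lwr->cnt = newcnt;
	return (old);
}

/* The old pointer and counter may be freed once the old counter sums to 0. */
static bool
sim_counter_drained(const struct sim_counter *c)
{
	int64_t sum = 0;

	for (int i = 0; i < SIM_NCPU; i++)
		sum += c->slot[i];
	return (sum == 0);
}
```

A reader that acquired on CPU 0 before the change and releases on CPU 1 after it still drains the old counter, at which point the old configuration can be freed.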
implementation restartable function
naive racy lwref_acquire
void *
lwref_acquire(lwref_t lwr, counter_u64_t *cp)
{
	void *ptr;

	ptr = lwr->ptr;
	*cp = lwr->cnt;		/* hand the counter back to the caller */
	counter_u64_add(*cp, 1);
	return (ptr);
}
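The race in the naive version can be made concrete with a deterministic userland model that splits the acquire into its two steps and lets the writer run in between. The `race_` names are hypothetical, and a plain integer stands in for counter(9).

```c
#include <assert.h>
#include <stdint.h>

/* Single-CPU stand-ins for the slide's types; all names are hypothetical. */
struct race_counter {
	int64_t val;
};

struct race_lwref {
	void *ptr;
	struct race_counter *cnt;
};

/* Step 1 of the naive acquire: load the pointer. */
static void *
race_load_ptr(const struct race_lwref *lwr)
{
	return (lwr->ptr);
}

/* Step 2: load the counter and increment it. */
static struct race_counter *
race_bump_cnt(struct race_lwref *lwr)
{
	struct race_counter *c = lwr->cnt;

	c->val++;
	return (c);
}

/* The writer swaps both fields between the reader's two steps. */
static void
race_change(struct race_lwref *lwr, void *newptr,
    struct race_counter *newcnt)
{
	lwr->ptr = newptr;
	lwr->cnt = newcnt;
}
```

If `race_change()` runs between the two steps, the reader ends up holding the old pointer while its reference was counted on the new counter. The old counter still reads zero, so the writer would consider the old configuration safe to free while it is in use.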
implementation restartable function
Hypothetical lwref_change operation
Update contents of lwref_t on all CPUs
Check if lwref_acquire is running on any CPU
How to check that?
And what if it is running?
implementation restartable function
lwref_change is SMP rendezvous
void
lwref_change(lwref_t lwr, void *newptr,
    void (*freefn)(void *, void *), void *freearg)
{
	struct lwref_change_ctx ctx;

	ctx.lwr = lwr;
	ctx.newptr = newptr;
	ctx.newcnt = counter_u64_alloc(M_WAITOK);
	/* Run lwref_change_action() on every CPU. */
	smp_rendezvous(NULL, lwref_change_action, NULL, &ctx);
	/* Deferred freeing via freefn/freearg is not shown on the slide. */
}
implementation restartable function
lwref_change_action code
void
lwref_change_action(void *v)
{
	struct lwref_change_ctx *ctx = v;
	lwref_t lwr = ctx->lwr;

	lwr->ptr = ctx->newptr;
	lwr->cnt = ctx->newcnt;
	/*
	 * Check if we interrupted lwref_acquire().
	 */
	...
}
implementation restartable function
interruption possibilities
The rendezvous IPI interrupted lwref_acquire
Any other interrupt (usually the timer) interrupted lwref_acquire, and the thread went onto the scheduler’s run queue before lwref_change executed
implementation restartable function
restartable lwref_acquire
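ENTRY(lwref_acquire)
	mov	(%rdi), %rax		/* rax = lwr->ptr, the return value */
	mov	0x8(%rdi), %rcx		/* rcx = lwr->cnt */
	mov	%rcx, (%rsi)		/* *cp = lwr->cnt */
	mov	$__pcpu, %rdx
	sub	%rdx, %rcx		/* counter offset in the per-CPU area */
	addq	$1, %gs:(%rcx)		/* increment this CPU's slot */
	.globl	lwref_acquire_ponr
lwref_acquire_ponr:			/* point of no return */
	ret
END(lwref_acquire)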
implementation restartable function
When to restart?
Option 1: whenever any interrupt interrupts lwref_acquire
Option 2: whenever lwref_change interrupts lwref_acquire
implementation any interrupt rolls back
Option 1: Any interrupt rolls back
The PUSH_FRAME() macro in amd64/include/asmacros.h should check and fix up %rip in the pushed frame.
Pros: very simple
Cons: extra instructions on every interrupt
implementation any interrupt rolls back
change to PUSH_FRAME() macro
@@ -167,7 +167,14 @@
 	movw	%es,TF_ES(%rsp) ;
 	movw	%ds,TF_DS(%rsp) ;
 	movl	$TF_HASSEGS,TF_FLAGS(%rsp) ;
-	cld
+	movq	TF_RIP(%rsp),%rax ;
+	cmpq	$lwref_acquire,%rax ;
+	jb	2f ;
+	cmpq	$lwref_acquire_ponr,%rax ;
+	jae	2f ;
+	movq	$lwref_acquire,%rax ;
+	movq	%rax,TF_RIP(%rsp) ;
+2:	cld
implementation lwref_change_action rolls back
Option 2: lwref_change rolls back
void
lwref_change_action(void *v)
{
	struct trapframe *tf;
	...
	/*
	 * Check if we interrupted lwref_acquire().
	 */
	tf = (struct trapframe *)
	    ((register_t *)__builtin_frame_address(1) + 2);
	lwref_fixup_rip(&tf->tf_rip);
}
implementation lwref_change_action rolls back
lwref_fixup_rip
static void
lwref_fixup_rip(register_t *rip)
{
	if (*rip >= (register_t)lwref_acquire &&
	    *rip < (register_t)lwref_acquire_ponr)
		*rip = (register_t)lwref_acquire;
}
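The rollback rule can be checked in isolation with a small userland model. This is a sketch only: `model_fixup_rip` is a hypothetical name, and plain integers stand in for the addresses of lwref_acquire (`start`) and lwref_acquire_ponr (`ponr`).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userland model of the rollback check: `start` stands for
 * the address of lwref_acquire and `ponr` for lwref_acquire_ponr.
 */
static uintptr_t
model_fixup_rip(uintptr_t rip, uintptr_t start, uintptr_t ponr)
{
	assert(start < ponr);
	/* Interrupted before the increment completed: restart from the top. */
	if (rip >= start && rip < ponr)
		return (start);
	/* At or past the point of no return, or outside the function. */
	return (rip);
}
```

For example, with `start = 0x1000` and `ponr = 0x1010`, a saved %rip of 0x100c is rolled back to 0x1000, while 0x1010 and 0x0fff are left alone. Restarting is harmless because everything before the increment only loads registers and stores the counter through the caller's pointer, and that store is simply repeated with the fresh value.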
implementation lwref_change_action rolls back
What about scheduler run queues?
New function:
void sched_foreach_on_runq(void (*)(void *));
implementation lwref_change_action rolls back
lwref_change rolls back (continued)
void
lwref_change_action(void *v)
{
	...
	sched_foreach_on_runq(lwref_fixup_td);
}
implementation lwref_change_action rolls back
naive lwref_fixup_td
static void
lwref_fixup_td(void *arg)
{
	struct thread *td = arg;
	struct trapframe *tf;

	tf = (struct trapframe *)
	    ((register_t *)(***(void ****)(td->td_pcb->pcb_rbp)) + 2);
	lwref_fixup_rip(&tf->tf_rip);
}
implementation lwref_change_action rolls back
static void
lwref_fixup_td(void *arg)
{
	struct thread *td = arg;
	struct trapframe *tf;
	register_t *rbp, rip;

	/* Walk the saved frame pointers up the thread's kernel stack. */
	for (rbp = (register_t *)td->td_pcb->pcb_rbp;
	    rbp && rbp < (register_t *)*rbp;
	    rbp = (register_t *)*rbp) {
		rip = (register_t)*(rbp + 1);
		/* A return address in an interrupt handler marks a trapframe. */
		if (rip == (register_t)timerint_ret ||
		    ...
		    rip == (register_t)ipi_intr_bitmap_handler_ret) {
			tf = (struct trapframe *)(rbp + 2);
			lwref_fixup_rip(&tf->tf_rip);
		}
	}
}
implementation lwref_change_action rolls back
hint from jhb@
Use td->td_frame to get access to frame :)
implementation lwref_change_action rolls back
Questions?