Transcript

  • 1. R(ead) C(opy) U(pdate)
  • 2. Agenda
    • What is RCU? Why?
    • RCU Primitives
    • RCU List Operations
    • Sleepable RCU
    • User Level RCU
    • Q&A
  • 3. What is RCU?
    • Read-copy-update
    • An alternative to rwlock (reader-writer locks)
    • Allows low-overhead, wait-free reads
    • Updates can be expensive: old copies must be kept for as long as readers may still be using them
  • 4. Why RCU?
    • Without a lock, this code is broken: compiler optimizations and CPU out-of-order execution can make gp visible before the fields it points to are initialized
      struct foo {
          int a;
          int b;
          int c;
      };
      struct foo *gp = NULL;

      /* . . . */

      p = kmalloc(sizeof(*p), GFP_KERNEL);
      p->a = 1;
      p->b = 2;
      p->c = 3;
      gp = p;  /* BROKEN: this store may be reordered before the three above */
  • 5. Why RCU?
    • Mutex: no concurrent readers
    • spin_lock: ditto
    • rwlock: allows concurrent readers. The right choice?
  • 6. Why RCU?
    • rwlock is expensive
    • Even read_lock has more overhead than spin_lock
    • If write_lock is not really rare, rwlock contention is much worse than spin_lock contention
  • 7. RCU Basics
    • Split update into removal and reclamation phases
    • Removal is performed immediately, while reclamation is deferred until all readers active during the removal phase have completed
    • Takes advantage of the fact that writes to single aligned pointers are atomic on modern CPUs
  • 8. RCU Terminology
    • read-side critical section: code delimited by rcu_read_lock() and rcu_read_unlock(); it MUST NOT sleep
    • quiescent state: any code not within an RCU read-side critical section
    • grace period: any time period during which each thread resides in at least one quiescent state
  • 9. RCU Terminology
    • More on grace periods: after a full grace period, all pre-existing RCU read-side critical sections have completed.
  • 10. RCU Update Sequence
    • Remove pointers to a data structure, so that subsequent readers cannot gain a reference to it
    • Wait for all previous readers to complete their RCU read-side critical sections (that is, wait for a grace period to elapse)
    • At this point, no reader can still hold a reference to the data structure, so it may now safely be reclaimed (e.g., in another thread)
  • 11. When Has a Grace Period Passed?
    • RCU readers are not permitted to block, switch to user-mode execution, or enter the idle loop.
    • As soon as a CPU is seen passing through any of these three states, we know that that CPU has exited any previous RCU read-side critical sections.
    • If we remove an item from a linked list, and then wait until all CPUs have switched context, executed in user mode, or executed in the idle loop, we can safely free up that item.
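The rule above can be modelled in a few lines of user-space C. This is a toy, single-threaded model under stated assumptions, not the kernel's implementation: the names `toy_cpu_quiescent`, `toy_gp_start`, and `toy_gp_elapsed` are hypothetical, and each "CPU" is just a counter that is bumped whenever it passes through a quiescent state (context switch, user mode, or idle loop).

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 4  /* hypothetical CPU count for the model */

/* Per-CPU count of quiescent states observed so far. */
static unsigned long toy_qs_count[TOY_NR_CPUS];

/* Snapshot of the counters taken when a grace period starts. */
struct toy_gp {
    unsigned long snap[TOY_NR_CPUS];
};

/* Called when a CPU context-switches, runs user code, or idles. */
static void toy_cpu_quiescent(int cpu)
{
    assert(cpu >= 0 && cpu < TOY_NR_CPUS);
    toy_qs_count[cpu]++;
}

/* Begin a grace period: remember where every CPU currently stands. */
static void toy_gp_start(struct toy_gp *gp)
{
    for (int i = 0; i < TOY_NR_CPUS; i++)
        gp->snap[i] = toy_qs_count[i];
}

/* The grace period has elapsed once every CPU has passed through
 * at least one quiescent state since the snapshot was taken. */
static bool toy_gp_elapsed(const struct toy_gp *gp)
{
    for (int i = 0; i < TOY_NR_CPUS; i++)
        if (toy_qs_count[i] == gp->snap[i])
            return false;
    return true;
}
```

An updater in this model would unlink the item, call `toy_gp_start()`, and free the item only after `toy_gp_elapsed()` returns true.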
  • 12. Core RCU APIs
    • rcu_read_lock()
    • rcu_read_unlock()
    • synchronize_rcu()/call_rcu()
    • rcu_assign_pointer()
    • rcu_dereference()
  • 13. Wait for Readers
    • synchronize_rcu(): blocks until all RCU read-side critical sections already in progress have completed; it does not wait for ones that start later
    • call_rcu(): registers a function and argument which are invoked after all ongoing RCU read-side critical sections have completed
  • 14. Assign & Retrieve
    • rcu_assign_pointer(): assign a new value to an RCU-protected pointer
    • rcu_dereference(): fetch an RCU-protected pointer, which is safe to use until rcu_read_unlock()
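As a user-space analogy only, the publish/subscribe pairing can be sketched with C11 atomics: a release store plays the role of rcu_assign_pointer() and an acquire load stands in for rcu_dereference() (the kernel primitive relies on dependency ordering instead, which is cheaper on most architectures). The names `toy_publish`, `toy_subscribe`, and `toy_read_sum` are hypothetical, and the sketch covers only the ordering guarantee, not grace periods or reclamation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct foo {
    int a;
    int b;
    int c;
};

/* Globally visible pointer; NULL until something is published. */
static _Atomic(struct foo *) gp;

/* Writer: initialize every field first, then publish with release
 * semantics so the field stores cannot move after the pointer store. */
static struct foo *toy_publish(int a, int b, int c)
{
    struct foo *p = malloc(sizeof(*p));

    if (!p)
        return NULL;
    p->a = a;
    p->b = b;
    p->c = c;
    atomic_store_explicit(&gp, p, memory_order_release);
    return p;
}

/* Reader: the acquire load guarantees that a non-NULL pointer
 * refers to a fully initialized structure. */
static const struct foo *toy_subscribe(void)
{
    return atomic_load_explicit(&gp, memory_order_acquire);
}

/* Example reader that uses the subscribed pointer. */
static int toy_read_sum(void)
{
    const struct foo *p = toy_subscribe();

    assert(p != NULL);  /* sketch assumes something was published */
    return p->a + p->b + p->c;
}
```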
  • 15. RCU List Insert
    • list_add_rcu()
    • list_add_tail_rcu()
    • list_replace_rcu()
    • Updaters must still be serialized against each other by a lock (for example, a spinlock).
  • 16. Sample Code
      struct foo {
          struct list_head list;  /* embedded list node, not a pointer */
          int a;
          int b;
          int c;
      };
      LIST_HEAD(head);
      static DEFINE_SPINLOCK(list_lock);  /* serializes updaters */

      /* . . . */
      p = kmalloc(sizeof(*p), GFP_KERNEL);
      p->a = 1;
      p->b = 2;
      p->c = 3;
      spin_lock(&list_lock);
      list_add_rcu(&p->list, &head);  /* inserts at the head of the list */
      spin_unlock(&list_lock);
  • 17. RCU List Traversal
    • list_for_each_entry_rcu()
    • rcu_read_lock() and rcu_read_unlock() must be called, but they never spin or block
    • Allows list_add_rcu() to execute concurrently with readers
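A user-space analogy of list_add_rcu() and list_for_each_entry_rcu() can be sketched with C11 atomics on a singly linked list. The names `toy_list_add` and `toy_list_sum` are hypothetical; the sketch assumes that writers are serialized by the caller (as with the spinlock in the insertion sample) and does not cover removal or reclamation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct toy_node {
    int value;
    _Atomic(struct toy_node *) next;
};

/* List head; NULL means the list is empty. */
static _Atomic(struct toy_node *) toy_head;

/* Writer side (caller must hold the update lock): link the new node
 * to the current first element, then publish it with a release store. */
static void toy_list_add(int value)
{
    struct toy_node *n = malloc(sizeof(*n));

    assert(n != NULL);  /* sketch: treat allocation failure as fatal */
    n->value = value;
    atomic_init(&n->next,
                atomic_load_explicit(&toy_head, memory_order_relaxed));
    atomic_store_explicit(&toy_head, n, memory_order_release);
}

/* Reader side: never blocks or spins. Each acquire load pairs with
 * the writer's release store, so any node a reader can reach is
 * fully initialized. */
static int toy_list_sum(void)
{
    int sum = 0;
    struct toy_node *n;

    for (n = atomic_load_explicit(&toy_head, memory_order_acquire);
         n != NULL;
         n = atomic_load_explicit(&n->next, memory_order_acquire))
        sum += n->value;
    return sum;
}
```

A reader that starts traversing just before an insertion sees either the old list or the list with the new node, never a half-initialized node.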
  • 18. RCU List Removal
    • list_del_rcu() removes element from list. Must be protected by some lock
    • But when to free it?
    • synchronize_rcu() blocks until all read-side critical sections that began before the call to synchronize_rcu() have completed.
    • call_rcu() returns immediately; its callback runs after all read-side critical sections that began before the call to call_rcu() have completed.
  • 19. Sample Code
      spin_lock(&mylock);
      p = search(head, key);
      if (p == NULL) {
          spin_unlock(&mylock);
      } else {
          list_del_rcu(&p->list);
          spin_unlock(&mylock);
          synchronize_rcu();  /* wait for pre-existing readers */
          kfree(p);
      }
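The sample above blocks in synchronize_rcu(). With call_rcu() the updater instead queues a callback and carries on. The following user-space toy illustrates that deferral pattern only; `toy_call_rcu`, `toy_grace_period_end`, and `struct toy_rcu_head` are hypothetical names loosely modelled on the kernel's struct rcu_head, and the model assumes some other mechanism decides when a grace period has elapsed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Callback header embedded in the object to be reclaimed. */
struct toy_rcu_head {
    struct toy_rcu_head *next;
    void (*func)(struct toy_rcu_head *head);
};

/* Callbacks queued and still waiting for a grace period. */
static struct toy_rcu_head *toy_pending;

/* Queue a callback and return immediately, without waiting. */
static void toy_call_rcu(struct toy_rcu_head *head,
                         void (*func)(struct toy_rcu_head *head))
{
    assert(head != NULL && func != NULL);
    head->func = func;
    head->next = toy_pending;
    toy_pending = head;
}

/* To be called once a grace period has elapsed: run every callback
 * queued so far. Returns the number of callbacks invoked. */
static int toy_grace_period_end(void)
{
    struct toy_rcu_head *list = toy_pending;
    int n = 0;

    toy_pending = NULL;
    while (list) {
        struct toy_rcu_head *next = list->next;  /* func may free list */

        list->func(list);
        list = next;
        n++;
    }
    return n;
}

/* Example object; the header is the first member, so a pointer to
 * the header is also a pointer to the object. */
struct toy_item {
    struct toy_rcu_head rcu;
    int key;
};

static int toy_freed;  /* counts reclaimed items, for illustration */

static void toy_item_reclaim(struct toy_rcu_head *head)
{
    free((struct toy_item *)head);
    toy_freed++;
}
```

In this model the updater would unlink the item under its lock, call `toy_call_rcu(&item->rcu, toy_item_reclaim)`, and return without blocking.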
  • 20. Sleepable RCU
    • Why?
      • Realtime kernels that require spinlock critical sections to be preemptible also require RCU read-side critical sections to be preemptible
  • 21. SRCU Implementation Strategy
    • Goal: prevent any given task sleeping in an RCU read-side critical section from blocking an unbounded number of RCU callbacks, by
      • refusing to provide asynchronous grace-period interfaces, such as Classic RCU's call_rcu() API
      • isolating grace-period detection within each subsystem that uses SRCU
  • 22. SRCU Grace Period?
    • Grace periods are detected by summing per-CPU counters.
      • Readers manipulate CPU-local counters.
      • Two sets of per-CPU counters are kept: one for readers that started before the current grace period, one for readers that start after it
  • 23. SRCU Data Structure
      struct srcu_struct {
          int completed;
          struct srcu_struct_array __percpu *per_cpu_ref;
          struct mutex mutex;
      };
      struct srcu_struct_array {
          int c[2];
      };
  • 24. Wait for Grace Period
    • synchronize_srcu()
      • Flip the completed counter, so that new readers use the other set of per-CPU counters.
      • Wait for the old set of counters to drain to zero.
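The flip-and-drain idea can be modelled without threads. This toy model collapses the per-CPU counters into a single pair and splits synchronize_srcu() into two hypothetical steps, `toy_srcu_flip` and `toy_srcu_drained`, so that the intermediate state can be inspected; the real implementation sums the counters across CPUs and blocks until the old set reaches zero.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_srcu {
    int completed;  /* low bit selects the counter set new readers use */
    int c[2];       /* active-reader counts, one per set */
};

/* Reader entry: pick the current set, count ourselves in, and return
 * the index so the matching unlock decrements the same counter. */
static int toy_srcu_read_lock(struct toy_srcu *sp)
{
    int idx = sp->completed & 1;

    sp->c[idx]++;
    return idx;
}

/* Reader exit: idx must be the value returned by the matching lock. */
static void toy_srcu_read_unlock(struct toy_srcu *sp, int idx)
{
    assert(sp->c[idx] > 0);
    sp->c[idx]--;
}

/* Step 1 of a grace period: flip, so new readers use the other set.
 * Returns the index of the old set, which must now drain. */
static int toy_srcu_flip(struct toy_srcu *sp)
{
    int old = sp->completed & 1;

    sp->completed++;
    return old;
}

/* Step 2: the grace period is over once the old set has drained. */
static bool toy_srcu_drained(const struct toy_srcu *sp, int old)
{
    return sp->c[old] == 0;
}
```

Because readers that arrive after the flip increment the other counter, they cannot delay the grace period that is already in progress. This is also why srcu_read_lock() returns an index that must be passed back to srcu_read_unlock().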
  • 25. SRCU APIs
    • int init_srcu_struct(struct srcu_struct *sp);
    • void cleanup_srcu_struct(struct srcu_struct *sp);
    • int srcu_read_lock(struct srcu_struct *sp) __acquires(sp);
    • void srcu_read_unlock(struct srcu_struct *sp, int idx);
    • void synchronize_srcu(struct srcu_struct *sp);
    • void synchronize_srcu_expedited(struct srcu_struct *sp);
    • long srcu_batches_completed(struct srcu_struct *sp);
  • 26. Userspace RCU
    • Available at http://lttng.org/urcu
    • git clone git://git.lttng.org/userspace-rcu.git
    • Debian: aptitude install liburcu-dev
    • Examples
  • 27. Q & A