SlideShare a Scribd company logo
B+	
  Review	
  
B+	
  Tree:	
  Most	
  Widely	
  Used	
  Index	
  
• Insert/delete	
  at	
  log	
  F	
  N	
  cost;	
  keep	
  tree	
  height-­‐
balanced.	
  	
  	
  (F	
  =	
  fanout,	
  N	
  =	
  #	
  leaf	
  pages)	
  
• Minimum	
  50%	
  occupancy	
  (except	
  for	
  root).	
  	
  Each	
  
node	
  contains	
  d	
  <=	
  	
  m	
  	
  <=	
  2d	
  entries.	
  	
  The	
  
parameter	
  d	
  is	
  called	
  the	
  order	
  of	
  the	
  tree.	
  
• Supports	
  equality	
  and	
  range-­‐searches	
  efficiently.	
  
Index Entries
Data Entries
("Sequence set")
(Direct search)
Example	
  B+	
  Tree	
  
• Search	
  begins	
  at	
  root,	
  and	
  key	
  comparisons	
  
direct	
  it	
  to	
  a	
  leaf	
  (as	
  in	
  ISAM).	
  
• Search	
  for	
  5*,	
  15*,	
  all	
  data	
  entries	
  >=	
  24*	
  ...	
  
☛ Based on the search for 15*, we know it is not in the tree!
Root
17 24 30
2* 3* 5* 7* 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39*
13
Inser[ng	
  a	
  Data	
  Entry	
  into	
  a	
  B+	
  Tree	
  
• Find	
  correct	
  leaf	
  L.	
  	
  
• Put	
  data	
  entry	
  onto	
  L.	
  
– If	
  L	
  has	
  enough	
  space,	
  done!	
  
– Else,	
  must	
  split	
  	
  L	
  (into	
  L	
  and	
  a	
  new	
  node	
  L2)	
  
• Redistribute	
  entries	
  evenly,	
  copy	
  up	
  middle	
  key.	
  
• Insert	
  index	
  entry	
  poin[ng	
  to	
  L2	
  into	
  parent	
  of	
  L.	
  
• This	
  can	
  happen	
  recursively	
  
– To	
  split	
  index	
  node,	
  redistribute	
  entries	
  evenly,	
  but	
  
push	
  up	
  middle	
  key.	
  	
  (Contrast	
  with	
  leaf	
  splits.)	
  
• Splits	
  “grow”	
  tree;	
  root	
  split	
  increases	
  height.	
  	
  	
  
– Tree	
  growth:	
  gets	
  wider	
  or	
  one	
  level	
  taller	
  at	
  top.	
  
Inser[ng	
  8*	
  into	
  Example	
  B+	
  Tree	
  
• Observe	
  how	
  
minimum	
  
occupancy	
  is	
  
guaranteed	
  in	
  
both	
  leaf	
  and	
  
index	
  pg	
  splits.	
  
• Note	
  difference	
  
between	
  copy-­‐up	
  
and	
  push-­‐up;	
  be	
  
sure	
  you	
  
understand	
  the	
  
reasons	
  for	
  this.	
  
2* 3* 5* 7* 8*
5
Entry to be inserted in parent node.
(Note that 5 is
continues to appear in the leaf.)
s copied up and
appears once in the index. Contrast
5 24 30
17
13
Entry to be inserted in parent node.
(Note that 17 is pushed up and only
this with a leaf split.)
Example	
  B+	
  Tree	
  
• We’re	
  going	
  to	
  insert	
  8.	
  
☛ Based on the search for 15*, we know it is not in the tree!
Root
17 24 30
2* 3* 5* 7* 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39*
13
Example	
  B+	
  Tree	
  Afer	
  Inser[ng	
  8*	
  
v Notice that root was split, leading to increase in height.
v In this example, we can avoid split by re-distributing
entries; however, this is usually not done in practice.
2* 3*
Root
17
24 30
14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39*
13
5
7*
5* 8*
Dele[ng	
  a	
  Data	
  Entry	
  from	
  a	
  B+	
  Tree	
  
• Start	
  at	
  root,	
  find	
  leaf	
  L	
  where	
  entry	
  belongs.	
  
• Remove	
  the	
  entry.	
  
– If	
  L	
  is	
  at	
  least	
  half-­‐full,	
  done!	
  	
  
– If	
  L	
  has	
  only	
  d-­‐1	
  entries,	
  
• Try	
  to	
  re-­‐distribute,	
  borrowing	
  from	
  sibling	
  (adjacent	
  node	
  with	
  
same	
  parent	
  as	
  L).	
  
• If	
  re-­‐distribu[on	
  fails,	
  merge	
  L	
  and	
  sibling.	
  
• If	
  merge	
  occurred,	
  must	
  delete	
  entry	
  (poin[ng	
  to	
  L	
  
or	
  sibling)	
  from	
  parent	
  of	
  L.	
  
• Merge	
  could	
  propagate	
  to	
  root,	
  decreasing	
  height.	
  
Delete	
  
Example	
  B+	
  Tree	
  Afer	
  Inser[ng	
  8*	
  
v We’re going to delete 19 and 20
2* 3*
Root
17
24 30
14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39*
13
5
7*
5* 8*
Example	
  Tree	
  Afer	
  (Inser[ng	
  8*,	
  
Then)	
  Dele[ng	
  19*	
  and	
  20*	
  ...	
  
• Dele[ng	
  19*	
  is	
  easy.	
  
• Dele[ng	
  20*	
  is	
  done	
  with	
  re-­‐distribu[on.	
  
No[ce	
  how	
  middle	
  key	
  is	
  copied	
  up.	
  
2* 3*
Root
17
30
14* 16* 33* 34* 38* 39*
13
5
7*
5* 8* 22* 24*
27
27* 29*
Next,	
  we	
  delete	
  24	
  
 	
  	
  	
  	
  	
  	
  	
  ...	
  And	
  Then	
  Dele[ng	
  24*	
  
• Must	
  merge.	
  
• Observe	
  `toss’	
  of	
  
index	
  entry	
  (on	
  right),	
  
and	
  `pull	
  down’	
  of	
  
index	
  entry	
  (below).	
  
30
22* 27* 29* 33* 34* 38* 39*
2* 3* 7* 14* 16* 22* 27* 29* 33* 34* 38* 39*
5* 8*
Root
30
13
5 17
Example	
  of	
  Non-­‐leaf	
  Re-­‐distribu[on	
  
• Tree	
  is	
  shown	
  below	
  during	
  dele>on	
  of	
  24*.	
  
(What	
  could	
  be	
  a	
  possible	
  ini[al	
  tree?)	
  
• In	
  contrast	
  to	
  previous	
  example,	
  can	
  re-­‐
distribute	
  entry	
  from	
  lef	
  child	
  of	
  root	
  to	
  right	
  
child.	
  	
  	
   Root
13
5 17 20
22
30
14* 16* 17* 18* 20* 33* 34* 38* 39*
22* 27* 29*
21*
7*
5* 8*
3*
2*
Afer	
  Re-­‐distribu[on	
  
• Entries	
  are	
  re-­‐distributed	
  by	
  `pushing	
  through’	
  
the	
  splilng	
  entry	
  in	
  the	
  parent	
  node.	
  
• It	
  suffices	
  to	
  re-­‐distribute	
  index	
  entry	
  with	
  key	
  
20;	
  we’ve	
  re-­‐distributed	
  17	
  as	
  well	
  
14* 16* 33* 34* 38* 39*
22* 27* 29*
17* 18* 20* 21*
7*
5* 8*
2* 3*
Root
13
5
17
30
20 22
B+	
  Concurrency	
  
Model	
  
• We	
  consider	
  page	
  lock(x)/unlock(x)	
  of	
  pages	
  
(only	
  for	
  writes!)	
  
• We	
  copy	
  into	
  our	
  memory	
  and	
  then	
  atomically	
  
update	
  pages.	
  
Simple	
  Approach	
  
• P1	
  searches	
  for	
  15	
  
• P2	
  inserts	
  9	
  
8	
   10	
   12	
   15	
  
…	
   15	
   …	
  
P1	
  
P1	
  
P2	
  
P2	
  
P2	
  
Afer	
  the	
  Inser[on	
  
• P1	
  searches	
  for	
  15	
  
• P2	
  inserts	
  9	
  
12	
   15	
  
…	
   10	
   15	
  
P1	
  
P2	
  
8	
   9	
   10	
  
P2	
  
P1	
  
P1	
  Finds	
  no	
  15!	
  
How	
  could	
  we	
  fix	
  this?	
  
B-­‐Link	
  Trees	
  
Two	
  important	
  Conven[ons	
  
• Search	
  for	
  B-­‐link	
  trees	
  root	
  to	
  leaf,	
  lef-­‐to-­‐right	
  
in	
  nodes	
  
• Inser[ons	
  for	
  B-­‐link	
  trees	
  proceed	
  bopom-­‐up.	
  	
  
• Parameter	
  d	
  =	
  the	
  degree	
  
Internal	
  Nodes	
  
30 120 240
Keys	
  k	
  <	
  30	
  
Keys	
  30<=k<120	
   Keys	
  120<=k<240	
  
Keys	
  240<=k	
  
Internal	
  Node	
  has	
  
s	
  >=	
  d	
  and	
  <=	
  2d	
  keys	
  
We	
  add	
  a	
  High	
  key	
  
280	
  
Keys	
  240<=280	
  
Add	
  right	
  pointers.	
  
Idea:	
  If	
  we	
  get	
  to	
  this	
  page,	
  looking	
  for	
  
300.	
  What	
  can	
  we	
  conclude	
  happened?	
  
Valid	
  Trees	
  &	
  Safe	
  Nodes	
  
• A	
  node	
  may	
  not	
  have	
  a	
  parent	
  node,	
  but	
  it	
  
must	
  have	
  a	
  lef	
  twin.	
  
• We	
  introduce	
  the	
  right	
  links	
  before	
  the	
  
parent.	
  
• A	
  node	
  is	
  safe	
  if	
  it	
  has	
  [k,2k-­‐1]	
  pointers.	
  
Scannode	
  
scannode(u,	
  	
  A)	
  :	
  examine	
  the	
  	
  tree	
  	
  node	
  	
  in	
  A	
  	
  for	
  	
  
value	
  	
  u	
  	
  and	
  	
  return	
  	
  the	
  	
  appropriate	
  	
  pointer	
  	
  
from	
  	
  A.	
  
	
  Appropriate	
  pointer	
  may	
  be	
  the	
  right	
  pointer.	
  	
  
Searching	
  for	
  v	
  
current	
  =	
  root;	
  
A	
  =	
  get(current);	
  
while	
  (current	
  is	
  not	
  a	
  leaf)	
  {	
  
	
  	
  	
  	
  	
  	
  	
  	
  current	
  =	
  scannode(v,	
  A);	
  
	
  	
  	
  	
  	
  	
  	
  	
  A	
  =	
  get(current);}	
  
while	
  ((t	
  =	
  scannode(v,A))	
  ==	
  link	
  pointer	
  of	
  A)	
  {	
  
	
  	
  	
  	
  	
  	
  	
  	
  current	
  =	
  t;	
  
	
  	
  	
  	
  	
  	
  	
  	
  A	
  =	
  get(current);}	
  
Return	
  (v	
  is	
  in	
  A)	
  ?	
  success	
  :	
  	
  failure;	
  
Find	
  the	
  leaf	
  w/	
  v	
  
Find	
  the	
  leaf	
  w/	
  v	
  
Only	
  modify	
  scannode	
  –	
  No	
  locking?!?	
  
Insert	
  
Revised	
  Approach	
  
• P1	
  searches	
  for	
  15	
  
• P2	
  inserts	
  9	
  
8	
   10	
   12	
   15	
  
…	
   15	
   …	
  
P1	
  
P1	
  
P2	
  
P2	
  
P2	
  
High	
  Key	
  Omiped	
  	
  
Revised	
  Approach:	
  Build	
  new	
  page	
  
• P1	
  searches	
  for	
  15	
  
• P2	
  inserts	
  9	
  
8	
   10	
   12	
   15	
  
…	
   15	
   …	
  
P1	
  
P2	
  
12	
   15	
  
Revised	
  Approach:	
  Build	
  new	
  page	
  
• P1	
  searches	
  for	
  15	
  
• P2	
  inserts	
  9	
  
8	
   9	
   10	
  
…	
   15	
   …	
  
P2	
  
12	
   15	
  
P1	
  
How	
  did	
  P1	
  know	
  
to	
  con[nue?	
  
Start	
  Insert	
  
ini[alize	
  stack;	
  current	
  =	
  root;	
  
A	
  =	
  get(current);	
  
while	
  (current	
  is	
  not	
  a	
  leaf)	
  {	
  
	
  	
  	
  	
  	
  	
  	
  	
  t	
  =	
  current;	
  
	
  	
  	
  	
  	
  	
  	
  	
  current	
  =	
  scannode(v,A);	
  
	
  	
  	
  	
  	
  	
  	
  	
  if	
  (current	
  not	
  link	
  pointer	
  in	
  A)	
  
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  push	
  t;	
  
	
  	
  	
  	
  	
  	
  	
  	
  A	
  =	
  get(current);}	
  
Keep	
  a	
  stack	
  of	
  the	
  
rightmost	
  node	
  we	
  
visited	
  at	
  each	
  level:	
  	
  
A	
  subrou[ne:	
  move_right	
  
While	
  t	
  =	
  scannode(v,A)	
  is	
  a	
  link	
  pointer	
  of	
  A	
  do	
  
	
  Lock(t)	
  
	
  	
  	
  	
  Unlock(current)	
  
	
  	
  	
  	
  Current	
  =	
  t	
  
	
  	
  	
  	
  A	
  =	
  get(current);	
  
end	
  
	
   The	
  move_right	
  procedure	
  scans	
  right	
  
across	
  the	
  leaves	
  with	
  lock	
  coupling.	
  	
  
How	
  many	
  locks	
  held	
  
here?	
  
Easy	
  case:	
  
DoInsert:	
  
	
  
if	
  A	
  is	
  safe	
  {	
  
	
  	
  	
  	
  	
  	
  	
  	
  insert	
  new	
  key/ptr	
  pair	
  on	
  A;	
  
	
  	
  	
  	
  	
  	
  	
  	
  put(A,	
  current);	
  
	
  	
  	
  	
  	
  	
  	
  	
  unlock(current);	
  
}	
  
Fun	
  Case:	
  Must	
  split	
  
u	
  =	
  allocate(1	
  new	
  page	
  for	
  B);	
  
redistribute	
  A	
  over	
  A	
  and	
  B	
  ;	
  
y	
  =	
  max	
  value	
  on	
  A	
  now;	
  
make	
  high	
  key	
  of	
  B	
  equal	
  old	
  high	
  key	
  of	
  A;	
  
make	
  right-­‐link	
  of	
  B	
  equal	
  old	
  right-­‐link	
  of	
  A;	
  
make	
  high	
  key	
  of	
  A	
  equal	
  y;	
  
make	
  right-­‐link	
  of	
  A	
  point	
  to	
  B;	
  	
  	
  	
  	
  	
  	
  
Insert	
  
put	
  (B,	
  u);	
  
put	
  (A,	
  current);	
  
oldnode	
  =	
  current;	
  
new	
  key/ptr	
  pair	
  =	
  (y,	
  u);	
  //	
  high	
  key	
  of	
  new	
  page,	
  
new	
  page	
  
current	
  =	
  pop(stack);	
  
lock(current);	
  A	
  =	
  get(current);	
  
move_right();	
  	
  
unlock(oldnode)	
  
goto	
  Doinser[on;	
  
may	
  have	
  3	
  locks:	
  oldnode,	
  and	
  
two	
  at	
  the	
  parent	
  level	
  while	
  
moving	
  right	
  
Deadlock	
  Free	
  
Total	
  Order	
  <	
  on	
  Nodes	
  
Consider	
  pages	
  a,b	
  	
  define	
  a	
  total	
  order	
  <	
  
1. a	
  <	
  b	
  if	
  b	
  is	
  closer	
  to	
  the	
  root	
  than	
  a	
  (different	
  
height)	
  
2. If	
  a	
  and	
  b	
  are	
  at	
  the	
  same	
  height,	
  then	
  a	
  <	
  b	
  if	
  
b	
  is	
  reachable.	
  
Observa[on:	
  Insert	
  process	
  only	
  puts	
  down	
  locks	
  
sa[sfying	
  this	
  order.	
  Why	
  is	
  this	
  true?	
  
“Order	
  is	
  bopom-­‐up”	
  
Deadlock	
  Free	
  
Since	
  the	
  locks	
  are	
  placed	
  by	
  every	
  process	
  in	
  a	
  
total	
  order,	
  there	
  can	
  be	
  no	
  deadlock.	
  Why?	
  
	
  Is	
  it	
  possible	
  to	
  get	
  the	
  cycle:	
  
T1(A)	
  T2(B)	
  T1(B)	
  T2(A)?	
  
Tree	
  Modifica[on	
  
Tree	
  Modifica[ons	
  
Thm:	
  All	
  opera[ons	
  correctly	
  modify	
  the	
  tree	
  
structure.	
  
Observa[on	
  1:	
  put(B,u)	
  and	
  put(A,	
  current)	
  are	
  one	
  opera[on	
  
(since	
  put(B,u)	
  doesn’t	
  change	
  tree.	
  Proof	
  by	
  pictures	
  (again).	
  
Revised	
  Approach:	
  Build	
  new	
  page	
  
• P1	
  searches	
  for	
  15	
  
• P2	
  inserts	
  9	
  
8	
   10	
   12	
   15	
  
…	
   15	
   …	
  
P1	
  
P2	
  
12	
   15	
  
Revised	
  Approach:	
  Build	
  new	
  page	
  
• P1	
  searches	
  for	
  15	
  
• P2	
  inserts	
  9	
  
8	
   9	
   10	
  
…	
   15	
   …	
  
P2	
  
12	
   15	
  
P1	
  
How	
  did	
  P1	
  know	
  
to	
  con[nue?	
  
Correct	
  Interac[on	
  of	
  	
  
Readers	
  and	
  Writers	
  
Correct	
  Interac[on	
  
Thm:	
  Ac[ons	
  of	
  an	
  inser[on	
  process	
  do	
  not	
  impair	
  
the	
  correctness	
  of	
  the	
  ac[ons	
  of	
  other	
  processes.	
  
Type	
  1:	
  No	
  split	
  
• P1	
  searches	
  for	
  15	
  
• P2	
  inserts	
  9	
  
8	
   10	
   15	
  
…	
   15	
   …	
  
P1	
  
8	
   9	
   10	
   15	
  
P2	
  
P2	
  reads	
  the	
  page.	
  
	
  What	
  schedule	
  is	
  this?	
  	
  
Why	
  can’t	
  P1,P2	
  conflict	
  again?	
  
What	
  if	
  P2	
  reads	
  afer	
  P1?	
  	
  
Type	
  2:	
  Split.	
  insert	
  into	
  lef	
  Node	
  
Type	
  2:	
  Split.	
  Insert	
  LHS.	
  
• P1	
  searches	
  for	
  8	
  
• P2	
  inserts	
  9	
  
7	
   10	
   12	
   15	
  
…	
   15	
   …	
  
P1	
  
P2	
  
12	
   15	
  
No[ce	
  that	
  P1	
  would	
  have	
  
followed	
  9s	
  pointer!	
   How	
  will	
  P1	
  find	
  8?	
  
Livelock	
  
P4	
  
P5	
  
P6	
  
Livelock	
  problem	
  
P2	
   P3	
  
P1	
  
Poor	
  P1	
  never	
  gets	
  its	
  value!	
  
P1	
  is	
  livelocked!	
  
Chaining	
  Example	
  
Can	
  we	
  get	
  down	
  below	
  3	
  locks?	
  
read	
  A;	
  
find	
  out	
  that	
  there	
  is	
  room;	
  
lock	
  and	
  re-­‐read	
  A;	
  
find	
  there	
  is	
  s[ll	
  room,	
  and	
  insert	
  9	
  
unlock	
  A;	
  
Consider	
  the	
  Alterna[ve	
  Protocol	
  	
  
(without	
  lock	
  coupling)	
  
5	
   6	
   12	
   15	
  
Large	
  #	
  of	
  inserts.	
  A	
  splits	
  
and	
  afer	
  there	
  is	
  room!	
  
A	
  
What	
  prevents	
  this	
  in	
  Blink?	
  
Further	
  Reading	
  
• Recent	
  HP	
  Tech	
  Report	
  is	
  great	
  source	
  
(Graefe)	
  
• Extensions:	
  R-­‐trees	
  and	
  GiST	
  
hpp://www.hpl.hp.com/techreports/2010/
HPL-­‐2010-­‐9.pdf	
  
Marcel	
  Kornacker,	
  Douglas	
  Banks:	
  High-­‐Concurrency	
  
Locking	
  in	
  R-­‐Trees.	
  VLDB	
  1995:	
  134-­‐145	
  
Marcel	
  Kornacker,	
  C.	
  Mohan,	
  Joseph	
  M.	
  Hellerstein:	
  
Concurrency	
  and	
  Recovery	
  in	
  Generalized	
  Search	
  
Trees.	
  SIGMOD	
  Conference	
  1997:	
  62-­‐72	
  

More Related Content

Featured

Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)
contently
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024
Albert Qian
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
Kurio // The Social Media Age(ncy)
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024
Search Engine Journal
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary
SpeakerHub
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
Tessa Mero
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Lily Ray
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
Rajiv Jayarajah, MAppComm, ACC
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
Christy Abraham Joy
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
Vit Horky
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
MindGenius
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
RachelPearson36
 
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Applitools
 
12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work
GetSmarter
 
ChatGPT webinar slides
ChatGPT webinar slidesChatGPT webinar slides
ChatGPT webinar slides
Alireza Esmikhani
 
More than Just Lines on a Map: Best Practices for U.S Bike Routes
More than Just Lines on a Map: Best Practices for U.S Bike RoutesMore than Just Lines on a Map: Best Practices for U.S Bike Routes
More than Just Lines on a Map: Best Practices for U.S Bike Routes
Project for Public Spaces & National Center for Biking and Walking
 
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
DevGAMM Conference
 
Barbie - Brand Strategy Presentation
Barbie - Brand Strategy PresentationBarbie - Brand Strategy Presentation
Barbie - Brand Strategy Presentation
Erica Santiago
 
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them wellGood Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
Saba Software
 

Featured (20)

Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
 
12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work
 
ChatGPT webinar slides
ChatGPT webinar slidesChatGPT webinar slides
ChatGPT webinar slides
 
More than Just Lines on a Map: Best Practices for U.S Bike Routes
More than Just Lines on a Map: Best Practices for U.S Bike RoutesMore than Just Lines on a Map: Best Practices for U.S Bike Routes
More than Just Lines on a Map: Best Practices for U.S Bike Routes
 
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
 
Barbie - Brand Strategy Presentation
Barbie - Brand Strategy PresentationBarbie - Brand Strategy Presentation
Barbie - Brand Strategy Presentation
 
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them wellGood Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
 

BTreeReview.pdf

  • 2. B+  Tree:  Most  Widely  Used  Index   • Insert/delete  at  log  F  N  cost;  keep  tree  height-­‐ balanced.      (F  =  fanout,  N  =  #  leaf  pages)   • Minimum  50%  occupancy  (except  for  root).    Each   node  contains  d  <=    m    <=  2d  entries.    The   parameter  d  is  called  the  order  of  the  tree.   • Supports  equality  and  range-­‐searches  efficiently.   Index Entries Data Entries ("Sequence set") (Direct search)
  • 3. Example  B+  Tree   • Search  begins  at  root,  and  key  comparisons   direct  it  to  a  leaf  (as  in  ISAM).   • Search  for  5*,  15*,  all  data  entries  >=  24*  ...   ☛ Based on the search for 15*, we know it is not in the tree! Root 17 24 30 2* 3* 5* 7* 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39* 13
  • 4. Inser[ng  a  Data  Entry  into  a  B+  Tree   • Find  correct  leaf  L.     • Put  data  entry  onto  L.   – If  L  has  enough  space,  done!   – Else,  must  split    L  (into  L  and  a  new  node  L2)   • Redistribute  entries  evenly,  copy  up  middle  key.   • Insert  index  entry  poin[ng  to  L2  into  parent  of  L.   • This  can  happen  recursively   – To  split  index  node,  redistribute  entries  evenly,  but   push  up  middle  key.    (Contrast  with  leaf  splits.)   • Splits  “grow”  tree;  root  split  increases  height.       – Tree  growth:  gets  wider  or  one  level  taller  at  top.  
  • 5. Inser[ng  8*  into  Example  B+  Tree   • Observe  how   minimum   occupancy  is   guaranteed  in   both  leaf  and   index  pg  splits.   • Note  difference   between  copy-­‐up   and  push-­‐up;  be   sure  you   understand  the   reasons  for  this.   2* 3* 5* 7* 8* 5 Entry to be inserted in parent node. (Note that 5 is continues to appear in the leaf.) s copied up and appears once in the index. Contrast 5 24 30 17 13 Entry to be inserted in parent node. (Note that 17 is pushed up and only this with a leaf split.)
  • 6. Example  B+  Tree   • We’re  going  to  insert  8.   ☛ Based on the search for 15*, we know it is not in the tree! Root 17 24 30 2* 3* 5* 7* 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39* 13
  • 7. Example  B+  Tree  Afer  Inser[ng  8*   v Notice that root was split, leading to increase in height. v In this example, we can avoid split by re-distributing entries; however, this is usually not done in practice. 2* 3* Root 17 24 30 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39* 13 5 7* 5* 8*
  • 8. Dele[ng  a  Data  Entry  from  a  B+  Tree   • Start  at  root,  find  leaf  L  where  entry  belongs.   • Remove  the  entry.   – If  L  is  at  least  half-­‐full,  done!     – If  L  has  only  d-­‐1  entries,   • Try  to  re-­‐distribute,  borrowing  from  sibling  (adjacent  node  with   same  parent  as  L).   • If  re-­‐distribu[on  fails,  merge  L  and  sibling.   • If  merge  occurred,  must  delete  entry  (poin[ng  to  L   or  sibling)  from  parent  of  L.   • Merge  could  propagate  to  root,  decreasing  height.  
  • 10. Example  B+  Tree  Afer  Inser[ng  8*   v We’re going to delete 19 and 20 2* 3* Root 17 24 30 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39* 13 5 7* 5* 8*
  • 11. Example  Tree  Afer  (Inser[ng  8*,   Then)  Dele[ng  19*  and  20*  ...   • Dele[ng  19*  is  easy.   • Dele[ng  20*  is  done  with  re-­‐distribu[on.   No[ce  how  middle  key  is  copied  up.   2* 3* Root 17 30 14* 16* 33* 34* 38* 39* 13 5 7* 5* 8* 22* 24* 27 27* 29* Next,  we  delete  24  
  • 12.                ...  And  Then  Dele[ng  24*   • Must  merge.   • Observe  `toss’  of   index  entry  (on  right),   and  `pull  down’  of   index  entry  (below).   30 22* 27* 29* 33* 34* 38* 39* 2* 3* 7* 14* 16* 22* 27* 29* 33* 34* 38* 39* 5* 8* Root 30 13 5 17
  • 13. Example  of  Non-­‐leaf  Re-­‐distribu[on   • Tree  is  shown  below  during  dele>on  of  24*.   (What  could  be  a  possible  ini[al  tree?)   • In  contrast  to  previous  example,  can  re-­‐ distribute  entry  from  lef  child  of  root  to  right   child.       Root 13 5 17 20 22 30 14* 16* 17* 18* 20* 33* 34* 38* 39* 22* 27* 29* 21* 7* 5* 8* 3* 2*
  • 14. Afer  Re-­‐distribu[on   • Entries  are  re-­‐distributed  by  `pushing  through’   the  splilng  entry  in  the  parent  node.   • It  suffices  to  re-­‐distribute  index  entry  with  key   20;  we’ve  re-­‐distributed  17  as  well   14* 16* 33* 34* 38* 39* 22* 27* 29* 17* 18* 20* 21* 7* 5* 8* 2* 3* Root 13 5 17 30 20 22
  • 16. Model   • We  consider  page  lock(x)/unlock(x)  of  pages   (only  for  writes!)   • We  copy  into  our  memory  and  then  atomically   update  pages.  
  • 17. Simple  Approach   • P1  searches  for  15   • P2  inserts  9   8   10   12   15   …   15   …   P1   P1   P2   P2   P2  
  • 18. Afer  the  Inser[on   • P1  searches  for  15   • P2  inserts  9   12   15   …   10   15   P1   P2   8   9   10   P2   P1   P1  Finds  no  15!   How  could  we  fix  this?  
  • 20. Two  important  Conven[ons   • Search  for  B-­‐link  trees  root  to  leaf,  lef-­‐to-­‐right   in  nodes   • Inser[ons  for  B-­‐link  trees  proceed  bopom-­‐up.    
  • 21. • Parameter  d  =  the  degree   Internal  Nodes   30 120 240 Keys  k  <  30   Keys  30<=k<120   Keys  120<=k<240   Keys  240<=k   Internal  Node  has   s  >=  d  and  <=  2d  keys   We  add  a  High  key   280   Keys  240<=280   Add  right  pointers.   Idea:  If  we  get  to  this  page,  looking  for   300.  What  can  we  conclude  happened?  
  • 22. Valid  Trees  &  Safe  Nodes   • A  node  may  not  have  a  parent  node,  but  it   must  have  a  lef  twin.   • We  introduce  the  right  links  before  the   parent.   • A  node  is  safe  if  it  has  [k,2k-­‐1]  pointers.  
  • 23. Scannode   scannode(u,    A)  :  examine  the    tree    node    in  A    for     value    u    and    return    the    appropriate    pointer     from    A.    Appropriate  pointer  may  be  the  right  pointer.    
  • 24. Searching  for  v   current  =  root;   A  =  get(current);   while  (current  is  not  a  leaf)  {                  current  =  scannode(v,  A);                  A  =  get(current);}   while  ((t  =  scannode(v,A))  ==  link  pointer  of  A)  {                  current  =  t;                  A  =  get(current);}   Return  (v  is  in  A)  ?  success  :    failure;   Find  the  leaf  w/  v   Find  the  leaf  w/  v   Only  modify  scannode  –  No  locking?!?  
  • 26. Revised  Approach   • P1  searches  for  15   • P2  inserts  9   8   10   12   15   …   15   …   P1   P1   P2   P2   P2   High  Key  Omiped    
  • 27. Revised  Approach:  Build  new  page   • P1  searches  for  15   • P2  inserts  9   8   10   12   15   …   15   …   P1   P2   12   15  
  • 28. Revised  Approach:  Build  new  page   • P1  searches  for  15   • P2  inserts  9   8   9   10   …   15   …   P2   12   15   P1   How  did  P1  know   to  con[nue?  
  • 29. Start  Insert   ini[alize  stack;  current  =  root;   A  =  get(current);   while  (current  is  not  a  leaf)  {                  t  =  current;                  current  =  scannode(v,A);                  if  (current  not  link  pointer  in  A)                                  push  t;                  A  =  get(current);}   Keep  a  stack  of  the   rightmost  node  we   visited  at  each  level:    
  • 30. A  subrou[ne:  move_right   While  t  =  scannode(v,A)  is  a  link  pointer  of  A  do    Lock(t)          Unlock(current)          Current  =  t          A  =  get(current);   end     The  move_right  procedure  scans  right   across  the  leaves  with  lock  coupling.     How  many  locks  held   here?  
  • 31. Easy  case:   DoInsert:     if  A  is  safe  {                  insert  new  key/ptr  pair  on  A;                  put(A,  current);                  unlock(current);   }  
  • 32. Fun  Case:  Must  split   u  =  allocate(1  new  page  for  B);   redistribute  A  over  A  and  B  ;   y  =  max  value  on  A  now;   make  high  key  of  B  equal  old  high  key  of  A;   make  right-­‐link  of  B  equal  old  right-­‐link  of  A;   make  high  key  of  A  equal  y;   make  right-­‐link  of  A  point  to  B;              
  • 33. Insert   put  (B,  u);   put  (A,  current);   oldnode  =  current;   new  key/ptr  pair  =  (y,  u);  //  high  key  of  new  page,   new  page   current  =  pop(stack);   lock(current);  A  =  get(current);   move_right();     unlock(oldnode)   goto  Doinser[on;   may  have  3  locks:  oldnode,  and   two  at  the  parent  level  while   moving  right  
  • 35. Total  Order  <  on  Nodes   Consider  pages  a,b    define  a  total  order  <   1. a  <  b  if  b  is  closer  to  the  root  than  a  (different   height)   2. If  a  and  b  are  at  the  same  height,  then  a  <  b  if   b  is  reachable.   Observa[on:  Insert  process  only  puts  down  locks   sa[sfying  this  order.  Why  is  this  true?   “Order  is  bopom-­‐up”  
  • 36. Deadlock  Free   Since  the  locks  are  placed  by  every  process  in  a   total  order,  there  can  be  no  deadlock.  Why?    Is  it  possible  to  get  the  cycle:   T1(A)  T2(B)  T1(B)  T2(A)?  
  • 38. Tree  Modifica[ons   Thm:  All  opera[ons  correctly  modify  the  tree   structure.   Observa[on  1:  put(B,u)  and  put(A,  current)  are  one  opera[on   (since  put(B,u)  doesn’t  change  tree.  Proof  by  pictures  (again).  
  • 39. Revised  Approach:  Build  new  page   • P1  searches  for  15   • P2  inserts  9   8   10   12   15   …   15   …   P1   P2   12   15  
  • 40. Revised  Approach:  Build  new  page   • P1  searches  for  15   • P2  inserts  9   8   9   10   …   15   …   P2   12   15   P1   How  did  P1  know   to  con[nue?  
  • 41. Correct  Interac[on  of     Readers  and  Writers  
  • 42. Correct  Interac[on   Thm:  Ac[ons  of  an  inser[on  process  do  not  impair   the  correctness  of  the  ac[ons  of  other  processes.  
  • 43. Type  1:  No  split   • P1  searches  for  15   • P2  inserts  9   8   10   15   …   15   …   P1   8   9   10   15   P2   P2  reads  the  page.    What  schedule  is  this?     Why  can’t  P1,P2  conflict  again?   What  if  P2  reads  afer  P1?    
  • 44. Type  2:  Split.  insert  into  lef  Node  
  • 45. Type  2:  Split.  Insert  LHS.   • P1  searches  for  8   • P2  inserts  9   7   10   12   15   …   15   …   P1   P2   12   15   No[ce  that  P1  would  have   followed  9s  pointer!   How  will  P1  find  8?  
  • 47. P4   P5   P6   Livelock  problem   P2   P3   P1   Poor  P1  never  gets  its  value!   P1  is  livelocked!  
  • 49. Can  we  get  down  below  3  locks?   read  A;   find  out  that  there  is  room;   lock  and  re-­‐read  A;   find  there  is  s[ll  room,  and  insert  9   unlock  A;   Consider  the  Alterna[ve  Protocol     (without  lock  coupling)   5   6   12   15   Large  #  of  inserts.  A  splits   and  afer  there  is  room!   A   What  prevents  this  in  Blink?  
  • 50. Further  Reading   • Recent  HP  Tech  Report  is  great  source   (Graefe)   • Extensions:  R-­‐trees  and  GiST   hpp://www.hpl.hp.com/techreports/2010/ HPL-­‐2010-­‐9.pdf   Marcel  Kornacker,  Douglas  Banks:  High-­‐Concurrency   Locking  in  R-­‐Trees.  VLDB  1995:  134-­‐145   Marcel  Kornacker,  C.  Mohan,  Joseph  M.  Hellerstein:   Concurrency  and  Recovery  in  Generalized  Search   Trees.  SIGMOD  Conference  1997:  62-­‐72