Highlighted notes of:
Chapter 11: Concurrent Stacks and Elimination
Book:
The Art of Multiprocessor Programming
Authors:
Maurice Herlihy
Nir Shavit
Maurice Herlihy has an A.B. in Mathematics from Harvard University and a Ph.D. in Computer Science from M.I.T. He has served on the faculty of Carnegie Mellon University and on the staff of DEC Cambridge Research Lab. He is the recipient of the 2003 Dijkstra Prize in Distributed Computing, the 2004 Gödel Prize in theoretical computer science, the 2008 ISCA Influential Paper Award, the 2012 Edsger W. Dijkstra Prize, and the 2013 Wallace McDowell Award. He received a 2012 Fulbright Distinguished Chair in the Natural Sciences and Engineering Lecturing Fellowship, and he is a fellow of the ACM, a fellow of the National Academy of Inventors, the National Academy of Engineering, and the American Academy of Arts and Sciences.
Nir Shavit received B.Sc. and M.Sc. degrees in Computer Science from the Technion - Israel Institute of Technology in 1984 and 1986, and a Ph.D. in Computer Science from the Hebrew University of Jerusalem in 1990. Shavit is a co-author of the book The Art of Multiprocessor Programming. He is a recipient of the 2004 Gödel Prize in theoretical computer science for his work on applying tools from algebraic topology to model shared memory computability, and of the 2012 Dijkstra Prize in Distributed Computing for the introduction of Software Transactional Memory. He is a past program chair of the ACM Symposium on Principles of Distributed Computing (PODC) and the ACM Symposium on Parallelism in Algorithms and Architectures (SPAA). His current research covers techniques for designing scalable software for multiprocessors, in particular concurrent data structures for multicore machines.
Control statements allow programs to select different execution paths based on conditions or iterate through loops. Common control statements in Java include if/else for conditional branching, while, do-while and for for iterative loops, and break, continue, return for jumping execution.
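A compact sketch showing these Java control statements working together (a hypothetical example, not taken from any of the summarized documents):

```java
public class ControlFlowDemo {
    // Sums the even numbers in an array, stopping at the first negative value.
    // Demonstrates if/else branching, a for loop, break, continue, and return.
    public static int sumEvensUntilNegative(int[] values) {
        int sum = 0;
        for (int v : values) {
            if (v < 0) {
                break;          // stop scanning at the first negative value
            } else if (v % 2 != 0) {
                continue;       // skip odd values, move to the next iteration
            }
            sum += v;
        }
        return sum;             // return hands the result back to the caller
    }

    public static void main(String[] args) {
        // 2 is added, 3 is skipped, 4 is added, -1 stops the loop.
        System.out.println(sumEvensUntilNegative(new int[]{2, 3, 4, -1, 6})); // 6
    }
}
```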
Classes are templates that define the form and behavior of objects. A class contains instance variables to represent object state and methods to implement object behavior. Objects are instances of classes that allocate memory at runtime. Methods allow classes to encapsulate and reuse code. Constructors initialize new objects and this keyword refers to the current object instance. Garbage collection automatically reclaims unused memory from objects no longer referenced.
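These ideas can be seen in a minimal class (an illustrative sketch; the class name is invented for this example):

```java
public class Counter {
    private int count;                 // instance variable: the object's state

    public Counter(int start) {        // constructor initializes the new object
        this.count = start;            // 'this' refers to the object being built
    }

    public int increment() {           // method: the object's behavior
        return ++count;
    }

    public int get() {
        return count;
    }

    public static void main(String[] args) {
        Counter c = new Counter(10);   // an object: an instance allocated at runtime
        c.increment();
        System.out.println(c.get());   // 11
        // When 'c' is no longer referenced, the garbage collector
        // reclaims the object's memory automatically.
    }
}
```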
The Ring programming language version 1.10 book - Part 126 of 212, by Mahmoud Samir Fayed
The document summarizes Ring functions for string manipulation. It provides the syntax, description, arguments, and return values for over 30 functions, including frAbs(), frAt(), frChr(), frEmpty(), frFile(), frLen(), frLTrim(), frPadL(), frProper(), frReplicate(), frRTrim(), frSpace(), frStr(), frStrTran(), and frSubStr(). These functions allow manipulating different parts of strings, padding strings, searching strings, and converting between strings and other data types.
Chap 4: Flow Control, Assertions, and Exception Handling (SCJP/OCJP), by IT Academy
The document discusses different types of loops in Java including while, do-while, for, and enhanced for loops. It explains the syntax and usage of each loop type with examples. Key loop concepts covered include boolean expressions for conditions, loop variables, continue and break statements. The document also discusses if/else and switch conditional statements in Java.
The document discusses various programming concepts in C# such as data types, control flow statements like if/else, switch, loops (for, while, do-while), methods for generating random numbers, and examples of programming problems involving calculations, patterns, and simulating dice rolls. Random numbers can be generated using the Random class, control structures conditionally execute blocks of code, and loops iterate code for a set number of repetitions.
The document discusses various programming concepts in C# such as random number generation, Boolean expressions, control statements like if/else, switch, for, while and do-while loops. It also covers nested loops, break and continue statements, and provides examples of problems that can be solved using these programming constructs like generating random numbers, printing patterns, and simulating dice rolls.
This document discusses control flow and control structures in programming, including conditional structures like if/else statements and repetition structures like while and for loops.
It provides examples of simple if statements, if/else statements, if/elif/else statements, and nested if statements. It also discusses indefinite loops with while and definite loops with for, and how to break and continue loops.
Finally, it covers pass and return statements, local and global scopes, function composition, lambda functions, string slicing, string methods, and that strings are immutable in Python.
1) The document provides an introduction to programming with the maXbox tool. It explains the interface and how to load sample code.
2) The sample code checks if numbers are prime by using a function called checkPrim() within a procedure called TestPrimNumbers(). It saves the results to a file.
3) The main routine calls TestPrimNumbers(), saves the output list to a file, loads the file, and displays the results along with performance times. This demonstrates functions, procedures, file I/O, and other basic programming concepts.
Loops in Lisp provides an overview of loops in Lisp, including loop syntax and constructs. The key points are:
- A loop is a series of expressions executed repeatedly through iteration. The loop macro drives the loop facility.
- Loops consist of clauses that specify variables, accumulation, conditions, execution control and more. Clauses execute in the order specified.
- Common loop constructs initialize variables, step variables between iterations, perform termination tests, and conditionally or unconditionally execute code.
- Loops can initialize and accumulate values, control iteration through conditions, and execute code before, during or after the loop.
Folding Unfolded - Polyglot FP for Fun and Profit - Haskell and Scala Part 2 ..., by Philip Schwarz
The document discusses implementing aggregation functions inductively using recursion or folding. It explains that aggregation functions like sum, count, max can be defined inductively with a base case for empty sequences and a recursive step to process additional elements. Implementing functions recursively risks stack overflow for long inputs, but tail-recursion and accumulator tricks can help avoid this. Folds provide an alternative to recursion for defining aggregations by processing sequences from left to right or right to left.
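The recursion-versus-fold tradeoff described above can be sketched in Java (the source slides use Haskell and Scala; these names are illustrative):

```java
import java.util.List;

public class FoldDemo {
    // Naive recursion: one stack frame per element, so very long lists
    // risk a StackOverflowError.
    public static int sumRecursive(List<Integer> xs) {
        if (xs.isEmpty()) return 0;                 // base case: empty sequence
        return xs.get(0)                            // recursive step: head plus...
             + sumRecursive(xs.subList(1, xs.size())); // ...sum of the tail
    }

    // Left-fold style with an accumulator: constant stack depth,
    // processing the sequence from left to right.
    public static int sumFold(List<Integer> xs) {
        int acc = 0;                                // the accumulator
        for (int x : xs) {
            acc = acc + x;                          // combine step
        }
        return acc;
    }

    public static void main(String[] args) {
        List<Integer> xs = List.of(1, 2, 3, 4);
        System.out.println(sumRecursive(xs));       // 10
        System.out.println(sumFold(xs));            // 10
    }
}
```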
The Ring programming language version 1.8 book - Part 24 of 202, by Mahmoud Samir Fayed
The document provides documentation on control structures in Ring including branching, looping, and exceptions. It covers if/elseif/else statements, switch statements, while loops, for loops, for-in loops, try/catch blocks, and various loop commands like exit, loop, and next. Examples are given for each control structure to demonstrate their syntax and usage. Key topics include short-circuit evaluation, using control structures inside functions, and modifying lists within for-in loops.
The document discusses various Java code snippets that test different aspects of conditional expressions and data types in Java. It provides the code snippets, expected output, and in some cases explains why the expected output is what it is based on Java's rules for determining the type of a conditional expression.
For loops allow code to be repeatedly executed until a condition is met. They include initialization, condition, and update statements. While and do-while loops also repeatedly execute code until a condition is met, but check the condition at the start or end of each iteration respectively. Loops are useful for tasks like adding numbers in a range or populating data structures to avoid repetitive code.
State Monad - Learn how it works - Follow Alvin Alexander’s example-driven build-up to the State Monad and then branch off into a detailed look at its inner workings.
This document contains 24 points summarizing key Java concepts such as keywords and reserved words, memory allocation parameters, garbage collection, wrapper classes, arrays, static imports, variable types, compilation units, operator precedence, conversions and casting, switch statements, enumeration, and more. It was prepared by Farag Zakaria Safy Saad as a study guide for the Java programming language.
This document provides instructions for implementing a basic four-function calculator using stacks and queues. It describes converting an infix expression to postfix notation, evaluating the postfix expression, and handling parentheses. Code is provided to get started, and the tasks are to complete functions to convert infix to postfix, evaluate postfix expressions, and update infix to postfix to handle parentheses. Test cases should be run to demonstrate the functionality.
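The postfix-evaluation step can be sketched as follows (a rough sketch assuming space-separated, nonnegative integer tokens; the assignment's actual code skeleton is not reproduced here):

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class PostfixEval {
    // Evaluates a space-separated postfix expression, e.g. "3 4 + 2 *".
    public static int eval(String expr) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String tok : expr.trim().split("\\s+")) {
            switch (tok) {
                case "+": case "-": case "*": case "/":
                    int b = stack.pop();            // right operand is on top
                    int a = stack.pop();            // left operand is beneath it
                    switch (tok) {
                        case "+": stack.push(a + b); break;
                        case "-": stack.push(a - b); break;
                        case "*": stack.push(a * b); break;
                        default:  stack.push(a / b); break;
                    }
                    break;
                default:
                    stack.push(Integer.parseInt(tok)); // operand: push it
            }
        }
        return stack.pop();                         // final result on top
    }

    public static void main(String[] args) {
        System.out.println(eval("3 4 + 2 *"));      // (3 + 4) * 2 = 14
    }
}
```

Infix-to-postfix conversion works similarly, but the stack holds pending operators (and opening parentheses) instead of operands.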
This chapter discusses various string manipulation techniques in C, including copying strings, comparing strings, removing trailing and leading spaces, padding strings, and copying portions of strings. The standard C library provides functions like strcpy(), strcmp(), printf(), and memcpy() that are useful for such tasks, and the chapter also implements helpers such as rtrim(). Examples demonstrate how to implement string operations like removing spaces, justifying text, and copying substrings.
The Ring programming language version 1.9 book - Part 124 of 210, by Mahmoud Samir Fayed
The document summarizes Ring documentation for various string and file handling functions in Ring including frFile(), frFileToStr(), frStr(), frSetIfEmpty(), frSpace(), frInList(), frForcePath(), frJustExt(), frJustStem(), frJustPath(), frForceExt(), frALines(), frJustFName(), frPadL(), frPadR(), frProper(), frReplicate(), frLen(), frStuff(), frSubStr(), frStrTran(), frListToString(), frInt(), frStringToList(), frIIf(), and frVal(). Each function is described in one sentence noting its purpose and key arguments.
Indus Valley Partner aptitude questions and answers, by Sushant Choudhary
This document contains several code snippets and questions about Java programming concepts like arrays, exceptions, interfaces, and access modifiers. The correct answers and explanations provided indicate that:
1) Option B is the only valid way to declare and initialize an array with multiple elements in one statement.
2) Interface methods can only use the public access modifier and cannot be static, final, private, protected, transient, volatile, or synchronized.
3) The default values for array elements of primitive types are 0 for int, 0.0f for float, and null for reference types like String. For non-primitive types like Dog, the default is also null.
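The third point can be checked with a few lines of Java (a minimal sketch):

```java
public class DefaultsDemo {
    public static void main(String[] args) {
        int[] ints = new int[1];          // primitive int slots default to 0
        float[] floats = new float[1];    // float slots default to 0.0f
        String[] strings = new String[1]; // reference slots default to null

        System.out.println(ints[0]);      // 0
        System.out.println(floats[0]);    // 0.0
        System.out.println(strings[0]);   // null
    }
}
```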
The Ring programming language version 1.8 book - Part 117 of 202, by Mahmoud Samir Fayed
This document summarizes functions in the Ring programming language for working with files, strings, lists, and numeric values. It describes over 30 functions, including frFile() to check if a file exists, frFileToStr() to get a file's contents as a string, frStr() to convert a number to a string, and frIIf() to conditionally return one of two values. The functions take various arguments like file names, values, and lists and return processed strings, numbers, logical values, or other data types.
Non-strict functions, bottom and Scala by-name parameters, by Philip Schwarz
Non-strict functions, bottom and Scala by-name parameters - ‘a close look’, through the work of Runar Bjarnason, Paul Chiusano, Martin Odersky, Bill Venners, Lex Spoon, Alvin Alexander, Mark Lewis and Aleksandar Prokopec.
The Ring programming language version 1.5.1 book - Part 18 of 180, by Mahmoud Samir Fayed
The document discusses various control structures in Ring including if/elseif/else statements, switch statements, while loops, for loops, and exceptions handling. Examples are provided to demonstrate the syntax and usage of if/elseif/else, switch, while, for, and try/catch structures. Looping constructs like for and while can be used to iterate over a range or list. Logical and short-circuit evaluation rules for conditional expressions are also covered.
The document discusses new features introduced in Java 8, including lambda expressions, functional interfaces, default and static methods in interfaces, and predefined functional interfaces like Predicate, Function, Consumer, and Supplier. It also covers stream API concepts like filtering, mapping, and processing streams using methods like collect(), count(), sorted(), min(), max(), forEach(), and toArray(). Stream API allows processing collections in a declarative way using lambda expressions and method references.
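A minimal sketch of such a declarative stream pipeline (class and data names are illustrative):

```java
import java.util.List;
import java.util.stream.Collectors;

public class StreamDemo {
    // Filters names longer than three characters, uppercases them,
    // and returns the survivors in sorted order.
    public static List<String> longNamesUpper(List<String> names) {
        return names.stream()
                .filter(s -> s.length() > 3)     // Predicate as a lambda
                .map(String::toUpperCase)        // Function as a method reference
                .sorted()
                .collect(Collectors.toList());   // terminal operation
    }

    public static void main(String[] args) {
        List<String> names = List.of("alice", "bob", "carol", "dave");
        System.out.println(longNamesUpper(names));               // [ALICE, CAROL, DAVE]
        System.out.println(names.stream()
                .filter(s -> s.length() > 3)
                .count());                                       // 3
    }
}
```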
Linked List Static and Dynamic Memory Allocation, by Prof Ansari
Static variables are declared and named while the program is being written. Space for them exists as long as the program in which they are declared is running. Static variables cannot be created or destroyed during execution of the program in which they are declared.
Dynamic variables are created (and may be destroyed) during program execution. Since dynamic variables do not exist while the program is compiled, but only when it is run, they cannot be assigned names while the program is being written; the only way to access them is through pointers. Once created, however, a dynamic variable does contain data and must have a type like any other variable. If a dynamic variable is created in a function, it can continue to exist even after the function terminates.
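The discussion above is in terms of C pointers; as a rough Java analogue (where every object is heap-allocated), a value created inside a function survives the function's return as long as a reference keeps it reachable:

```java
public class HeapDemo {
    static final class Node {          // a dynamically allocated record type
        int data;
        Node next;
    }

    // The Node created here outlives makeNode: the returned reference
    // keeps it reachable, so the garbage collector does not reclaim it.
    public static Node makeNode(int data) {
        Node n = new Node();           // allocated at runtime, not named at compile time
        n.data = data;
        return n;
    }

    public static void main(String[] args) {
        Node n = makeNode(42);         // the function has returned; the node persists
        System.out.println(n.data);    // 42
    }
}
```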
Linked Linear List
We saw in previous chapters how the static representation of a linear ordered list through an array leads to wasted memory and, in some cases, to overflow. Now we do not want to assign memory to a linear list in advance; instead we want to allocate memory to elements as they are inserted into the list. This requires dynamic allocation of memory, which can be achieved with the malloc() or calloc() functions.
But the memory assigned to the elements will not be contiguous, which is a requirement for a linear ordered list and was provided by the array representation. How could we achieve this?
Stacks, Queues, and Deques describes different data structures and their properties. Stacks follow LIFO order with insertion and removal at the same end. Queues follow FIFO order with insertion at one end and removal at the other. Deques allow insertion and removal at both ends. These structures can be implemented using arrays or linked lists.
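The three disciplines can be demonstrated with java.util.ArrayDeque, which supports all of them (an illustrative sketch):

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class LinearStructures {
    // Stack (LIFO): insertion and removal at the same end.
    public static int stackTop() {
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(1);
        stack.push(2);
        return stack.pop();           // 2: the last item in is the first out
    }

    // Queue (FIFO): insertion at one end, removal at the other.
    public static int queueFront() {
        Deque<Integer> queue = new ArrayDeque<>();
        queue.addLast(1);
        queue.addLast(2);
        return queue.pollFirst();     // 1: the first item in is the first out
    }

    // Deque: insertion and removal allowed at both ends.
    public static int dequeBack() {
        Deque<Integer> deque = new ArrayDeque<>();
        deque.addFirst(1);
        deque.addLast(2);
        return deque.pollLast();      // 2: removed from the opposite end
    }

    public static void main(String[] args) {
        System.out.println(stackTop() + " " + queueFront() + " " + dequeBack()); // 2 1 2
    }
}
```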
This document discusses data structures in C, specifically lists and pointers. It covers linked lists as an alternative to arrays for representing lists, with each node containing a data element and pointer to the next node. It describes using pointers in C with address and dereference operators. It also covers dynamically allocating memory with malloc and free. The document discusses implementing singly linked lists, stacks, and queues using pointers. It covers representing polynomials as linked lists and operations like adding and erasing polynomials.
Stacks, queues, and deques are common linear data structures. A stack follows LIFO order, with items added and removed from the same end. A queue follows FIFO order, with items added to one end and removed from the other. A deque allows additions and removals from both ends. These structures can be implemented using either arrays or linked lists. Array implementations require care to avoid overflow and underflow, while linked list versions have efficient addition and removal at both ends.
This document discusses stacks and queues as data structures. It begins by describing stacks as having a last-in, first-out characteristic and common uses like method call order tracking. It then provides examples of implementing a stack using an array and manipulating elements by pushing and popping. The document also covers queues as having a first-in, first-out characteristic like a waiting line. It proposes a Queue interface and discusses challenges of array-based implementations versus using Java's LinkedList class.
Linked Lists: The Role of Locking : The Art of Multiprocessor Programming : N..., by Subhajit Sahu
This document discusses different techniques for synchronizing access to linked lists in concurrent programs, including coarse-grained locking, fine-grained locking, optimistic synchronization, lazy synchronization, and nonblocking synchronization. It focuses on using linked lists to implement a set data structure and provides code examples of coarse-grained and fine-grained locking approaches. Fine-grained locking uses individual locks on each node to allow for more concurrency during traversal, while still ensuring safe modifications through lock coupling that acquires locks in order along the list.
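The lock-coupling (hand-over-hand) idea can be sketched in Java, assuming a sorted list with sentinel head and tail nodes. This follows the spirit of the fine-grained list described above, but it is an illustrative add() only, not the book's full code:

```java
import java.util.concurrent.locks.ReentrantLock;

public class FineList {
    private static final class Node {
        final int key;
        Node next;
        final ReentrantLock lock = new ReentrantLock();
        Node(int key, Node next) { this.key = key; this.next = next; }
    }

    // Sentinels guarantee every traversal stops between two lockable nodes.
    private final Node head =
            new Node(Integer.MIN_VALUE, new Node(Integer.MAX_VALUE, null));

    public boolean add(int key) {
        head.lock.lock();                    // lock coupling: lock pred before curr
        Node pred = head;
        try {
            Node curr = pred.next;
            curr.lock.lock();
            try {
                while (curr.key < key) {     // hand-over-hand traversal
                    pred.lock.unlock();      // safe: curr (the new pred) stays locked
                    pred = curr;
                    curr = curr.next;
                    curr.lock.lock();        // acquire locks in list order
                }
                if (curr.key == key) {
                    return false;            // key already in the set
                }
                pred.next = new Node(key, curr); // safe: both neighbors are locked
                return true;
            } finally {
                curr.lock.unlock();
            }
        } finally {
            pred.lock.unlock();
        }
    }

    public static void main(String[] args) {
        FineList set = new FineList();
        System.out.println(set.add(5));      // true
        System.out.println(set.add(5));      // false
    }
}
```

Because locks are always acquired in list order, two threads can traverse and modify disjoint parts of the list concurrently without deadlocking.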
Given the following ADT definition of a stack to use stack .docx, by shericehewat
The document provides an abstract data type (ADT) definition for a queue, including common queue methods like clear(), isEmpty(), enqueue(), front(), dequeue(), and length(). It supplies an existing array-based queue class with attributes such as allocSize, numItems, frontIndex, backIndex, and a dynamic items array representing the queue, and the question asks to add member methods to this implementation.
The document discusses stacks and queues. It defines stacks as LIFO data structures and queues as FIFO data structures. It describes basic stack operations like push and pop and basic queue operations like enqueue and dequeue. It then discusses implementing stacks and queues using arrays and linked lists, outlining the key operations and memory requirements for each implementation.
The document discusses stacks and queues. It begins by defining a stack as a collection of homogeneous data elements where insertion and deletion occurs at one end, known as the top. The stack follows the LIFO (last in, first out) principle. Common stack operations like push and pop are introduced. Implementing stacks using both arrays and linked lists is covered. Applications of stacks like balancing symbols, recursion, undo operations are mentioned. The document then moves to discussing queues, their applications and implementations. Priority queues and their applications are also briefly covered.
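An array-based stack along these lines might look like the following (a sketch; the document's own code is not reproduced here):

```java
public class ArrayStack {
    private final int[] items;
    private int top = -1;            // index of the top element; -1 means empty

    public ArrayStack(int capacity) {
        items = new int[capacity];
    }

    public boolean isEmpty() {
        return top == -1;
    }

    public void push(int x) {
        if (top == items.length - 1) {
            throw new IllegalStateException("stack overflow");
        }
        items[++top] = x;            // insertion happens at the top (LIFO)
    }

    public int pop() {
        if (isEmpty()) {
            throw new IllegalStateException("stack underflow");
        }
        return items[top--];         // deletion also happens at the top
    }

    public static void main(String[] args) {
        ArrayStack s = new ArrayStack(4);
        s.push(1);
        s.push(2);
        System.out.println(s.pop()); // 2: last in, first out
    }
}
```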
This document provides information about sorting algorithms. It begins with an introduction to sorting, explaining why sorting is important and common sorting algorithms like merge sort, quicksort, heapsort, etc. It then discusses different types of sorting algorithms like comparison-based sorting and specialized sorting. The document proceeds to explain several specific sorting algorithms in detail, including bubble sort, selection sort, insertion sort, merge sort, and bitonic sort. It provides pseudocode for the algorithms and examples to illustrate how they work on sample data. The key details covered are the time complexity of common sorting algorithms and how different algorithms have tradeoffs in terms of speed and memory usage.
This is a rapid fire talk that outline some of the issues around multithreaded code, some of the tools you can use to test it and how best to spend your resources doing so
A queue is a data structure that follows the first-in, first-out (FIFO) principle. Elements are added to the rear of the queue and removed from the front. A queue can be implemented using an array, with indexes to track the front and rear elements. When an element is added (enqueued), the rear index is incremented; when an element is removed (dequeued), the front index is incremented and the element at that index is returned. The queue has a limited capacity based on the size of the array.
The document discusses different implementations of stacks and queues using linked lists and arrays. It describes how stacks can be implemented using a linked list, with push and pop operations adding and removing nodes from the head of the list. Queues can also be implemented with linked lists or arrays, but arrays require additional logic to maintain first-in, first-out order efficiently. Common applications of stacks and queues include evaluating expressions, checking for balanced brackets, and managing tasks in operating systems and server requests.
DSA (Data Structure and Algorithm) QuestionsRESHAN FARAZ
The document contains questions related to data structures and algorithms. Some questions ask to perform operations on data structures like stacks, queues, trees, graphs and hash tables. Other questions involve algorithms like topological sorting, shortest path algorithms, minimum spanning tree algorithms, and binary search tree operations. The document tests understanding of concepts taught in an algorithms course through examples and problem solving questions.
1) Stacks are last-in, first-out (LIFO) data structures that allow efficient insertion and removal of elements from only one end, called the top. Common stack operations are push and pop.
2) Stacks have many applications, including checking for balanced brackets in code, evaluating postfix expressions by using intermediate results, and implementing function calls and recursion by saving activation records on a stack.
3) Stacks can be implemented using either a linked list or an array. The array implementation uses an array and a top pointer to track the top of the stack.
Similar to Concurrent Stacks and Elimination : The Art of Multiprocessor Programming : Notes (20)
About TrueTime, Spanner, Clock synchronization, CAP theorem, Two-phase lockin...Subhajit Sahu
TrueTime is a service that enables the use of globally synchronized clocks, with bounded error. It returns a time interval that is guaranteed to contain the clock’s actual time for some time during the call’s execution. If two intervals do not overlap, then we know calls were definitely ordered in real time. In general, synchronized clocks can be used to avoid communication in a distributed system.
The underlying source of time is a combination of GPS receivers and atomic clocks. As there are “time masters” in every datacenter (redundantly), it is likely that both sides of a partition would continue to enjoy accurate time. Individual nodes however need network connectivity to the masters, and without it their clocks will drift. Thus, during a partition their intervals slowly grow wider over time, based on bounds on the rate of local clock drift. Operations depending on TrueTime, such as Paxos leader election or transaction commits, thus have to wait a little longer, but the operation still completes (assuming the 2PC and quorum communication are working).
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Subhajit Sahu
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components, and processes them in topological order, one level at a time. This enables calculation for ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It however comes with a precondition of the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. Slowdown on the GPU is likely caused by a large submission of small workloads, and expected to be non-issue when the computation is performed on massive graphs.
Adjusting Bitset for graph : SHORT REPORT / NOTESSubhajit Sahu
Compressed Sparse Row (CSR) is an adjacency-list based graph representation that is commonly used for efficient graph computations. Unfortunately, using CSR for dynamic graphs is impractical since addition/deletion of a single edge can require on average (N+M)/2 memory accesses, in order to update source-offsets and destination-indices. A common approach is therefore to store edge-lists/destination-indices as an array of arrays, where each edge-list is an array belonging to a vertex. While this is good enough for small graphs, it quickly becomes a bottleneck for large graphs. What causes this bottleneck depends on whether the edge-lists are sorted or unsorted. If they are sorted, checking for an edge requires about log(E) memory accesses, but adding an edge on average requires E/2 accesses, where E is the number of edges of a given vertex. Note that both addition and deletion of edges in a dynamic graph require checking for an existing edge, before adding or deleting it. If edge lists are unsorted, checking for an edge requires around E/2 memory accesses, but adding an edge requires only 1 memory access.
Techniques to optimize the pagerank algorithm usually fall in two categories. One is to try reducing the work per iteration, and the other is to try reducing the number of iterations. These goals are often at odds with one another. Skipping computation on vertices which have already converged has the potential to save iteration time. Skipping in-identical vertices, with the same in-links, helps reduce duplicate computations and thus could help reduce iteration time. Road networks often have chains which can be short-circuited before pagerank computation to improve performance. Final ranks of chain nodes can be easily calculated. This could reduce both the iteration time, and the number of iterations. If a graph has no dangling nodes, pagerank of each strongly connected component can be computed in topological order. This could help reduce the iteration time, no. of iterations, and also enable multi-iteration concurrency in pagerank computation. The combination of all of the above methods is the STICD algorithm. [sticd] For dynamic graphs, unchanged components whose ranks are unaffected can be skipped altogether.
Adjusting primitives for graph : SHORT REPORT / NOTESSubhajit Sahu
Graph algorithms, like PageRank Compressed Sparse Row (CSR) is an adjacency-list based graph representation that is
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
Experiments with Primitive operations : SHORT REPORT / NOTESSubhajit Sahu
This includes:
- Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
- Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
- Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
- Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
Techniques to optimize the pagerank algorithm usually fall in two categories. One is to try reducing the work per iteration, and the other is to try reducing the number of iterations. These goals are often at odds with one another. Skipping computation on vertices which have already converged has the potential to save iteration time. Skipping in-identical vertices, with the same in-links, helps reduce duplicate computations and thus could help reduce iteration time. Road networks often have chains which can be short-circuited before pagerank computation to improve performance. Final ranks of chain nodes can be easily calculated. This could reduce both the iteration time, and the number of iterations. If a graph has no dangling nodes, pagerank of each strongly connected component can be computed in topological order. This could help reduce the iteration time, no. of iterations, and also enable multi-iteration concurrency in pagerank computation. The combination of all of the above methods is the STICD algorithm. [sticd] For dynamic graphs, unchanged components whose ranks are unaffected can be skipped altogether.
Adjusting OpenMP PageRank : SHORT REPORT / NOTESSubhajit Sahu
For massive graphs that fit in RAM, but not in GPU memory, it is possible to take
advantage of a shared memory system with multiple CPUs, each with multiple cores, to
accelerate pagerank computation. If the NUMA architecture of the system is properly taken
into account with good vertex partitioning, the speedup can be significant. To take steps in
this direction, experiments are conducted to implement pagerank in OpenMP using two
different approaches, uniform and hybrid. The uniform approach runs all primitives required
for pagerank in OpenMP mode (with multiple threads). On the other hand, the hybrid
approach runs certain primitives in sequential mode (i.e., sumAt, multiply).
word2vec, node2vec, graph2vec, X2vec: Towards a Theory of Vector Embeddings o...Subhajit Sahu
Below are the important points I note from the 2020 paper by Martin Grohe:
- 1-WL distinguishes almost all graphs, in a probabilistic sense
- Classical WL is two dimensional Weisfeiler-Leman
- DeepWL is an unlimited version of WL graph that runs in polynomial time.
- Knowledge graphs are essentially graphs with vertex/edge attributes
ABSTRACT:
Vector representations of graphs and relational structures, whether handcrafted feature vectors or learned representations, enable us to apply standard data analysis and machine learning techniques to the structures. A wide range of methods for generating such embeddings have been studied in the machine learning and knowledge representation literature. However, vector embeddings have received relatively little attention from a theoretical point of view.
Starting with a survey of embedding techniques that have been used in practice, in this paper we propose two theoretical approaches that we see as central for understanding the foundations of vector embeddings. We draw connections between the various approaches and suggest directions for future research.
DyGraph: A Dynamic Graph Generator and Benchmark Suite : NOTESSubhajit Sahu
https://gist.github.com/wolfram77/54c4a14d9ea547183c6c7b3518bf9cd1
There exist a number of dynamic graph generators. Barbasi-Albert model iteratively attach new vertices to pre-exsiting vertices in the graph using preferential attachment (edges to high degree vertices are more likely - rich get richer - Pareto principle). However, graph size increases monotonically, and density of graph keeps increasing (sparsity decreasing).
Gorke's model uses a defined clustering to uniformly add vertices and edges. Purohit's model uses motifs (eg. triangles) to mimick properties of existing dynamic graphs, such as growth rate, structure, and degree distribution. Kronecker graph generators are used to increase size of a given graph, with power-law distribution.
To generate dynamic graphs, we must choose a metric to compare two graphs. Common metrics include diameter, clustering coefficient (modularity?), triangle counting (triangle density?), and degree distribution.
In this paper, the authors propose Dygraph, a dynamic graph generator that uses degree distribution as the only metric. The authors observe that many real-world graphs differ from the power-law distribution at the tail end. To address this issue, they propose binning, where the vertices beyond a certain degree (minDeg = min(deg) s.t. |V(deg)| < H, where H~10 is the number of vertices with a given degree below which are binned) are grouped into bins of degree-width binWidth, max-degree localMax, and number of degrees in bin with at least one vertex binSize (to keep track of sparsity). This helps the authors to generate graphs with a more realistic degree distribution.
The process of generating a dynamic graph is as follows. First the difference between the desired and the current degree distribution is calculated. The authors then create an edge-addition set where each vertex is present as many times as the number of additional incident edges it must recieve. Edges are then created by connecting two vertices randomly from this set, and removing both from the set once connected. Currently, authors reject self-loops and duplicate edges. Removal of edges is done in a similar fashion.
Authors observe that adding edges with power-law properties dominates the execution time, and consider parallelizing DyGraph as part of future work.
My notes on shared memory parallelism.
Shared memory is memory that may be simultaneously accessed by multiple programs with an intent to provide communication among them or avoid redundant copies. Shared memory is an efficient means of passing data between programs. Using memory for communication inside a single program, e.g. among its multiple threads, is also referred to as shared memory [REF].
A Dynamic Algorithm for Local Community Detection in Graphs : NOTESSubhajit Sahu
**Community detection methods** can be *global* or *local*. **Global community detection methods** divide the entire graph into groups. Existing global algorithms include:
- Random walk methods
- Spectral partitioning
- Label propagation
- Greedy agglomerative and divisive algorithms
- Clique percolation
https://gist.github.com/wolfram77/b4316609265b5b9f88027bbc491f80b6
There is a growing body of work in *detecting overlapping communities*. **Seed set expansion** is a **local community detection method** where a relevant *seed vertices* of interest are picked and *expanded to form communities* surrounding them. The quality of each community is measured using a *fitness function*.
**Modularity** is a *fitness function* which compares the number of intra-community edges to the expected number in a random-null model. **Conductance** is another popular fitness score that measures the community cut or inter-community edges. Many *overlapping community detection* methods **use a modified ratio** of intra-community edges to all edges with atleast one endpoint in the community.
Andersen et al. use a **Spectral PageRank-Nibble method** which minimizes conductance and is formed by adding vertices in order of decreasing PageRank values. Andersen and Lang develop a **random walk approach** in which some vertices in the seed set may not be placed in the final community. Clauset gives a **greedy method** that *starts from a single vertex* and then iteratively adds neighboring vertices *maximizing the local modularity score*. Riedy et al. **expand multiple vertices** via maximizing modularity.
Several algorithms for **detecting global, overlapping communities** use a *greedy*, *agglomerative approach* and run *multiple separate seed set expansions*. Lancichinetti et al. run **greedy seed set expansions**, each with a *single seed vertex*. Overlapping communities are produced by a sequentially running expansions from a node not yet in a community. Lee et al. use **maximal cliques as seed sets**. Havemann et al. **greedily expand cliques**.
The authors of this paper discuss a dynamic approach for **community detection using seed set expansion**. Simply marking the neighbours of changed vertices is a **naive approach**, and has *severe shortcomings*. This is because *communities can split apart*. The simple updating method *may fail even when it outputs a valid community* in the graph.
Scalable Static and Dynamic Community Detection Using Grappolo : NOTESSubhajit Sahu
A **community** (in a network) is a subset of nodes which are _strongly connected among themselves_, but _weakly connected to others_. Neither the number of output communities nor their size distribution is known a priori. Community detection methods can be divisive or agglomerative. **Divisive methods** use _betweeness centrality_ to **identify and remove bridges** between communities. **Agglomerative methods** greedily **merge two communities** that provide maximum gain in _modularity_. Newman and Girvan have introduced the **modularity metric**. The problem of community detection is then reduced to the problem of modularity maximization which is **NP-complete**. **Louvain method** is a variant of the _agglomerative strategy_, in that is a _multi-level heuristic_.
https://gist.github.com/wolfram77/917a1a4a429e89a0f2a1911cea56314d
In this paper, the authors discuss **four heuristics** for Community detection using the _Louvain algorithm_ implemented upon recently developed **Grappolo**, which is a parallel variant of the Louvain algorithm. They are:
- Vertex following and Minimum label
- Data caching
- Graph coloring
- Threshold scaling
With the **Vertex following** heuristic, the _input is preprocessed_ and all single-degree vertices are merged with their corresponding neighbours. This helps reduce the number of vertices considered in each iteration, and also help initial seeds of communities to be formed. With the **Minimum label heuristic**, when a vertex is making the decision to move to a community and multiple communities provided the same modularity gain, the community with the smallest id is chosen. This helps _minimize or prevent community swaps_. With the **Data caching** heuristic, community information is stored in a vector instead of a map, and is reused in each iteration, but with some additional cost. With the **Vertex ordering via Graph coloring** heuristic, _distance-k coloring_ of graphs is performed in order to group vertices into colors. Then, each set of vertices (by color) is processed _concurrently_, and synchronization is performed after that. This enables us to mimic the behaviour of the serial algorithm. Finally, with the **Threshold scaling** heuristic, _successively smaller values of modularity threshold_ are used as the algorithm progresses. This allows the algorithm to converge faster, and it has been observed a good modularity score as well.
From the results, it appears that _graph coloring_ and _threshold scaling_ heuristics do not always provide a speedup and this depends upon the nature of the graph. It would be interesting to compare the heuristics against baseline approaches. Future work can include _distributed memory implementations_, and _community detection on streaming graphs_.
Application Areas of Community Detection: A Review : NOTESSubhajit Sahu
This is a short review of Community detection methods (on graphs), and their applications. A **community** is a subset of a network whose members are *highly connected*, but *loosely connected* to others outside their community. Different community detection methods *can return differing communities* these algorithms are **heuristic-based**. **Dynamic community detection** involves tracking the *evolution of community structure* over time.
https://gist.github.com/wolfram77/09e64d6ba3ef080db5558feb2d32fdc0
Communities can be of the following **types**:
- Disjoint
- Overlapping
- Hierarchical
- Local.
The following **static** community detection **methods** exist:
- Spectral-based
- Statistical inference
- Optimization
- Dynamics-based
The following **dynamic** community detection **methods** exist:
- Independent community detection and matching
- Dependent community detection (evolutionary)
- Simultaneous community detection on all snapshots
- Dynamic community detection on temporal networks
**Applications** of community detection include:
- Criminal identification
- Fraud detection
- Criminal activities detection
- Bot detection
- Dynamics of epidemic spreading (dynamic)
- Cancer/tumor detection
- Tissue/organ detection
- Evolution of influence (dynamic)
- Astroturfing
- Customer segmentation
- Recommendation systems
- Social network analysis (both)
- Network summarization
- Privary, group segmentation
- Link prediction (both)
- Community evolution prediction (dynamic, hot field)
<br>
<br>
## References
- [Application Areas of Community Detection: A Review : PAPER](https://ieeexplore.ieee.org/document/8625349)
This paper discusses a GPU implementation of the Louvain community detection algorithm. Louvain algorithm obtains hierachical communities as a dendrogram through modularity optimization. Given an undirected weighted graph, all vertices are first considered to be their own communities. In the first phase, each vertex greedily decides to move to the community of one of its neighbours which gives greatest increase in modularity. If moving to no neighbour's community leads to an increase in modularity, the vertex chooses to stay with its own community. This is done sequentially for all the vertices. If the total change in modularity is more than a certain threshold, this phase is repeated. Once this local moving phase is complete, all vertices have formed their first hierarchy of communities. The next phase is called the aggregation phase, where all the vertices belonging to a community are collapsed into a single super-vertex, such that edges between communities are represented as edges between respective super-vertices (edge weights are combined), and edges within each community are represented as self-loops in respective super-vertices (again, edge weights are combined). Together, the local moving and the aggregation phases constitute a stage. This super-vertex graph is then used as input fof the next stage. This process continues until the increase in modularity is below a certain threshold. As a result from each stage, we have a hierarchy of community memberships for each vertex as a dendrogram.
Approaches to perform the Louvain algorithm can be divided into coarse-grained and fine-grained. Coarse-grained approaches process a set of vertices in parallel, while fine-grained approaches process all vertices in parallel. A coarse-grained hybrid-GPU algorithm using multi GPUs has be implemented by Cheong et al. which grabbed my attention. In addition, their algorithm does not use hashing for the local moving phase, but instead sorts each neighbour list based on the community id of each vertex.
https://gist.github.com/wolfram77/7e72c9b8c18c18ab908ae76262099329
Survey for extra-child-process package : NOTESSubhajit Sahu
Useful additions to inbuilt child_process module.
📦 Node.js, 📜 Files, 📰 Docs.
Please see attached PDF for literature survey.
https://gist.github.com/wolfram77/d936da570d7bf73f95d1513d4368573e
Dynamic Batch Parallel Algorithms for Updating PageRank : POSTERSubhajit Sahu
This paper presents two algorithms for efficiently computing PageRank on dynamically updating graphs in a batched manner: DynamicLevelwisePR and DynamicMonolithicPR. DynamicLevelwisePR processes vertices level-by-level based on strongly connected components and avoids recomputing converged vertices on the CPU. DynamicMonolithicPR uses a full power iteration approach on the GPU that partitions vertices by in-degree and skips unaffected vertices. Evaluation on real-world graphs shows the batched algorithms provide speedups of up to 4000x over single-edge updates and outperform other state-of-the-art dynamic PageRank algorithms.
Abstract for IPDPS 2022 PhD Forum on Dynamic Batch Parallel Algorithms for Up...Subhajit Sahu
For the PhD forum an abstract submission is required by 10th May, and poster by 15th May. The event is on 30th May.
https://gist.github.com/wolfram77/1c1f730d20b51e0d2c6d477fd3713024
Fast Incremental Community Detection on Dynamic Graphs : NOTESSubhajit Sahu
In this paper, the authors describe two approaches for dynamic community detection using the CNM algorithm. CNM is a hierarchical, agglomerative algorithm that greedily maximizes modularity. They define two approaches: BasicDyn and FastDyn. BasicDyn backtracks merges of communities until each marked (changed) vertex is its own singleton community. FastDyn undoes a merge only if the quality of merge, as measured by the induced change in modularity, has significantly decreased compared to when the merge initially took place. FastDyn also allows more than two vertices to contract together if in the previous time step these vertices eventually ended up contracted in the same community. In the static case, merging several vertices together in one contraction phase could lead to deteriorating results. FastDyn is able to do this, however, because it uses information from the merges of the previous time step. Intuitively, merges that previously occurred are more likely to be acceptable later.
https://gist.github.com/wolfram77/1856b108334cc822cdddfdfa7334792a
The Future of Wearable Technology in Healthcare: Innovations and Trends to WatchbluetroyvictorVinay
As wearable technology continues to shape multiple facets of our lives, its potential in healthcare is becoming increasingly apparent. With the rapid advancement of technology, the integration of wearables into healthcare systems worldwide is accelerating. In this evolving field, we delve into the latest innovations and trends that are transforming healthcare.
"IOS 18 CONTROL CENTRE REVAMP STREAMLINED IPHONE SHUTDOWN MADE EASIER"Emmanuel Onwumere
In iOS 18, Apple has introduced a significant revamp to the Control Centre, making it more intuitive and user-friendly. One of the standout features is a quicker and more accessible way to shut down your iPhone. This enhancement aims to streamline the user experience, allowing for faster access to essential functions. Discover how iOS 18's redesigned Control Centre can simplify your daily interactions with your iPhone, bringing convenience right at your fingertips.
We’re Underestimating the Damage Extreme Weather Does to Rooftop Solar PanelsGrid Freedom Inc.
Grid Freedom is the best solar leads company based in New Jersey that provides Exclusive solar appointments of qualified solar appointments for guaranteed solar appointments for the best way to get solar leads throughout the nation. Grid Freedom is a solar lead provider, that connects exclusive pre-set appointments with pre-screened homeowners who are ready for solar company leads. The solar lead generators company was founded to provide solar appointment leads contractors with better solar sales leads-buying high-quality exclusive solar leads experience that gives pre-set solar appointments great ROI.
Company Profile of Tempcon - Chiller Manufacturer In Indiasoumotempcon
This is the company profile of Tempcon - chiller manufacturer in India. Tempcon manufactures water cooled and air cooled chillers and industrial AC. The company has been in the business since 1983.
website: https://www.tempcon.co.in/
Chapter 11 Concurrent Stacks and Elimination

Figure 11.1 A lock-free stack. In part (a), a thread pushes value a onto the stack by applying a compareAndSet() to the top field. In part (b), a thread pops value a from the stack by applying a compareAndSet() to the top field.
 1 public class LockFreeStack<T> {
 2   AtomicReference<Node> top = new AtomicReference<Node>(null);
 3   static final int MIN_DELAY = ...;
 4   static final int MAX_DELAY = ...;
 5   Backoff backoff = new Backoff(MIN_DELAY, MAX_DELAY);
 6
 7   protected boolean tryPush(Node node) {
 8     Node oldTop = top.get();
 9     node.next = oldTop;
10     return top.compareAndSet(oldTop, node);
11   }
12   public void push(T value) {
13     Node node = new Node(value);
14     while (true) {
15       if (tryPush(node)) {
16         return;
17       } else {
18         backoff.backoff();
19       }
20     }
21   }
Figure 11.2 The LockFreeStack<T> class: in the push() method, threads alternate between
trying to alter the top reference by calling tryPush(), and backing off using the Backoff class
from Fig. 7.5 of Chapter 7.
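The Backoff class itself is not reproduced in these notes. A minimal sketch of such an exponential-backoff helper, in the spirit of Fig. 7.5 (the field names and the use of milliseconds are my assumptions, not the book's exact code):

```java
import java.util.Random;
import java.util.concurrent.TimeUnit;

// Minimal exponential backoff: each call sleeps for a random duration
// drawn from [0, limit), then doubles limit, capped at maxDelay.
public class Backoff {
  final int minDelay, maxDelay;
  int limit;                          // current upper bound on the delay
  final Random random = new Random();

  public Backoff(int min, int max) {
    minDelay = min;
    maxDelay = max;
    limit = minDelay;
  }

  public void backoff() throws InterruptedException {
    int delay = random.nextInt(limit);     // random delay in [0, limit)
    limit = Math.min(maxDelay, 2 * limit); // widen the window for next time
    TimeUnit.MILLISECONDS.sleep(delay);
  }
}
```

Each failed compareAndSet() widens the window from which the random delay is drawn, so repeatedly colliding threads spread out in time.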
is empty.) The tryPop() method is called until it succeeds, at which point pop()
returns the value from the removed node.
As we have seen in Chapter 7, one can significantly reduce contention at the
top field using exponential backoff (see Fig. 7.5 of Chapter 7). Accordingly, both
11.2 An Unbounded Lock-Free Stack
1 public class Node {
2   public T value;
3   public Node next;
4   public Node(T value) {
5     this.value = value;
6     next = null;
7   }
8 }
Figure 11.3 Lock-free stack list node.
 1 protected Node tryPop() throws EmptyException {
 2   Node oldTop = top.get();
 3   if (oldTop == null) {
 4     throw new EmptyException();
 5   }
 6   Node newTop = oldTop.next;
 7   if (top.compareAndSet(oldTop, newTop)) {
 8     return oldTop;
 9   } else {
10     return null;
11   }
12 }
13 public T pop() throws EmptyException {
14   while (true) {
15     Node returnNode = tryPop();
16     if (returnNode != null) {
17       return returnNode.value;
18     } else {
19       backoff.backoff();
20     }
21   }
22 }
Figure 11.4 The LockFreeStack<T> class: The pop() method alternates between trying to
change the top field and backing off.
the push() and pop() methods back off after an unsuccessful call to tryPush()
or tryPop().
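To see the CAS retry loop in action, here is a self-contained, simplified variant of the same stack (backoff omitted for brevity; the class and method names here are mine, not the book's), exercised by several concurrent pushers:

```java
import java.util.concurrent.atomic.AtomicReference;

// Simplified CAS-based stack: push and pop each loop until their
// compareAndSet() on top succeeds.
public class StackDemo {
  static class Node { int value; Node next; Node(int v) { value = v; } }
  static final AtomicReference<Node> top = new AtomicReference<>(null);

  static void push(int v) {
    Node node = new Node(v);
    while (true) {                       // retry until the CAS succeeds
      Node oldTop = top.get();
      node.next = oldTop;
      if (top.compareAndSet(oldTop, node)) return;
    }
  }

  static Integer pop() {
    while (true) {
      Node oldTop = top.get();
      if (oldTop == null) return null;   // empty stack
      if (top.compareAndSet(oldTop, oldTop.next)) return oldTop.value;
    }
  }

  public static void main(String[] args) throws InterruptedException {
    Thread[] ts = new Thread[4];
    for (int i = 0; i < 4; i++) {
      final int base = i * 1000;
      ts[i] = new Thread(() -> { for (int j = 0; j < 1000; j++) push(base + j); });
      ts[i].start();
    }
    for (Thread t : ts) t.join();
    int count = 0;
    while (pop() != null) count++;
    System.out.println(count);           // all 4000 pushes are popped exactly once
  }
}
```

Because every successful compareAndSet() installs exactly one node (or removes exactly one), no push is lost and no value is popped twice, regardless of interleaving.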
This implementation is lock-free because a thread fails to complete a push() or
pop() method call only if there were infinitely many successful calls that modified
the top of the stack. The linearization point of both the push() and the pop()
methods is the successful compareAndSet(), or the throwing of the exception,
in Line 3, in case of a pop() on an empty stack. Note that the compareAndSet()
call by pop() does not have an ABA problem (see Chapter 10) because the Java
garbage collector ensures that a node cannot be reused by one thread, as long as
that node is accessible to another thread. Designing a lock-free stack that avoids
the ABA problem without a garbage collector is left as an exercise.
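One common starting point for that exercise is to pair the top reference with a version stamp that is bumped on every successful update, so a recycled node can be told apart from its earlier incarnation. A sketch using Java's AtomicStampedReference (offered as a hint, not the book's solution; class and method names are mine):

```java
import java.util.concurrent.atomic.AtomicStampedReference;

// Stack whose top carries a stamp alongside the reference; a CAS
// succeeds only if both the reference AND the stamp match, defeating ABA.
public class StampedTop {
  static class Node { int value; Node next; Node(int v) { value = v; } }
  final AtomicStampedReference<Node> top =
      new AtomicStampedReference<>(null, 0);

  public void push(int v) {
    Node node = new Node(v);
    int[] stamp = new int[1];
    while (true) {
      Node oldTop = top.get(stamp);      // read reference and stamp together
      node.next = oldTop;
      if (top.compareAndSet(oldTop, node, stamp[0], stamp[0] + 1)) return;
    }
  }

  public Integer pop() {
    int[] stamp = new int[1];
    while (true) {
      Node oldTop = top.get(stamp);
      if (oldTop == null) return null;   // empty stack
      if (top.compareAndSet(oldTop, oldTop.next, stamp[0], stamp[0] + 1))
        return oldTop.value;
    }
  }
}
```

In languages without garbage collection the analogous technique is a double-width CAS on a pointer–counter pair, so a node returned to a free list cannot masquerade as its old self.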
11.3 Elimination
The LockFreeStack implementation scales poorly, not so much because the
stack’s top field is a source of contention, but primarily because it is a sequen-
tial bottleneck: method calls can proceed only one after the other, ordered by
successful compareAndSet() calls applied to the stack’s top field.
Although exponential backoff can significantly reduce contention, it does
nothing to alleviate the sequential bottleneck. To make the stack parallel, we
exploit this simple observation: if a push() is immediately followed by a pop(),
the two operations cancel out, and the stack’s state does not change. It is as if
both operations never happened. If one could somehow cause concurrent pairs
of pushes and pops to cancel, then threads calling push() could exchange val-
ues with threads calling pop(), without ever modifying the stack itself. These two
calls would eliminate one another.
As depicted in Fig. 11.5, threads eliminate one another through an
EliminationArray in which threads pick random array entries to try to meet
complementary calls. Pairs of complementary push() and pop() calls exchange
values and return. A thread whose call cannot be eliminated, either because it has
failed to find a partner, or found a partner with the wrong kind of method call
(such as a push() meeting a push()), can either try again to eliminate at a new
location, or can access the shared LockFreeStack. The combined data structure,
array, and shared stack, is linearizable because the shared stack is linearizable,
and the eliminated calls can be ordered as if they happened at the point in which
they exchanged values.
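The cancellation observation is easy to check against any sequential stack: a push() immediately followed by a pop() returns the pushed value and leaves the stack exactly as it was. A quick illustration, using java.util.ArrayDeque as the stack (the helper method is ours):

```java
import java.util.ArrayDeque;
import java.util.Deque;

class CancellationDemo {
    // Returns true if pushing x and immediately popping it both yields
    // x itself and leaves the stack's contents unchanged.
    static <T> boolean pushPopCancels(Deque<T> stack, T x) {
        String before = stack.toString();
        stack.push(x);
        T popped = stack.pop();
        return popped.equals(x) && stack.toString().equals(before);
    }
}
```

Elimination exploits exactly this: the pair exchanges the value directly and the stack object is never touched.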
We can use the EliminationArray as a backoff scheme on a shared
LockFreeStack. Each thread first accesses the LockFreeStack, and if it fails
Figure 11.5 The EliminationBackoffStack<T> class. Each thread selects a random loca-
tion in the array. If thread A’s pop() and B’s push() calls arrive at the same location at about
the same time, then they exchange values without accessing the shared LockFreeStack.
Thread C, which does not meet another thread, eventually pops the shared LockFreeStack.
to complete its call (that is, the compareAndSet() attempt fails), it attempts to
eliminate its call using the array instead of simply backing off. If it fails to elimi-
nate itself, it calls the LockFreeStack again, and so on. We call this structure an
EliminationBackoffStack.
11.4 The Elimination Backoff Stack
Here is how to construct an EliminationBackoffStack, a lock-free linearizable
stack implementation.
We are reminded of a story about two friends who are discussing poli-
tics on election day, each trying, to no avail, to convince the other to switch
sides.
Finally, one says to the other: “Look, it’s clear that we are unalterably opposed
on every political issue. Our votes will surely cancel out. Why not save ourselves
some time and both agree to not vote today?”
The other agrees enthusiastically and they part.
Shortly after that, a friend of the first one who had heard the conversation says,
“That was a sporting offer you made.”
“Not really,” says the second. “This is the third time I’ve done this today.”
The principle behind our construction is the same. We wish to allow threads
with pushes and pops to coordinate and cancel out, but must avoid a situation in
which a thread can make a sporting offer to more than one other thread. We do
so by implementing the EliminationArray using coordination structures called
exchangers, objects that allow exactly two threads (and no more) to rendezvous
and exchange values.
We already saw how to exchange values using locks in the synchronous queue
of Chapter 10. Here, we need a lock-free exchange, one in which threads spin
rather than block, as we expect them to wait only for very short durations.
11.4.1 A Lock-Free Exchanger
A LockFreeExchanger<T> object permits two threads to exchange values of
type T. If thread A calls the object’s exchange() method with argument a, and
B calls the same object’s exchange() method with argument b, then A’s call will
return value b and vice versa. On a high level, the exchanger works by having the
first thread arrive to write its value, and spin until a second arrives. The second
then detects that the first is waiting, reads its value, and signals the exchange.
They each have now read the other’s value, and can return. The first thread’s call
may time out if the second does not show up within a reasonable duration, allowing
it to proceed and leave the exchanger.
1 public class LockFreeExchanger<T> {
2 static final int EMPTY = ..., WAITING = ..., BUSY = ...;
3 AtomicStampedReference<T> slot = new AtomicStampedReference<T>(null, 0);
4 public T exchange(T myItem, long timeout, TimeUnit unit)
5 throws TimeoutException {
6 long nanos = unit.toNanos(timeout);
7 long timeBound = System.nanoTime() + nanos;
8 int[] stampHolder = {EMPTY};
9 while (true) {
10 if (System.nanoTime() > timeBound)
11 throw new TimeoutException();
12 T yrItem = slot.get(stampHolder);
13 int stamp = stampHolder[0];
14 switch(stamp) {
15 case EMPTY:
16 if (slot.compareAndSet(yrItem, myItem, EMPTY, WAITING)) {
17 while (System.nanoTime() < timeBound){
18 yrItem = slot.get(stampHolder);
19 if (stampHolder[0] == BUSY) {
20 slot.set(null, EMPTY);
21 return yrItem;
22 }
23 }
24 if (slot.compareAndSet(myItem, null, WAITING, EMPTY)) {
25 throw new TimeoutException();
26 } else {
27 yrItem = slot.get(stampHolder);
28 slot.set(null, EMPTY);
29 return yrItem;
30 }
31 }
32 break;
33 case WAITING:
34 if (slot.compareAndSet(yrItem, myItem, WAITING, BUSY))
35 return yrItem;
36 break;
37 case BUSY:
38 break;
39 default: // impossible
40 ...
41 }
42 }
43 }
44 }
Figure 11.6 The LockFreeExchanger<T> Class.
The LockFreeExchanger<T> class appears in Fig. 11.6. It has a single
AtomicStampedReference<T> field, slot.1 The exchanger has three possible
states: EMPTY, BUSY, or WAITING. The reference’s stamp records the exchanger’s
state (Line 14). The exchanger’s main loop continues until the timeout limit
1 See Chapter 10, Pragma 10.6.1.
passes, when it throws an exception (Line 10). In the meantime, a thread reads
the state of the slot (Line 12) and proceeds as follows:
If the state is EMPTY, then the thread tries to place its item in the slot and set
the state to WAITING using a compareAndSet() (Line 16). If it fails, then some
other thread succeeds and it retries. If it was successful (Line 17), then its item
is in the slot and the state is WAITING, so it spins, waiting for another thread
to complete the exchange. If another thread shows up, it will take the item in
the slot, replace it with its own, and set the state to BUSY (Line 19), indicat-
ing to the waiting thread that the exchange is complete. The waiting thread
will consume the item and reset the state to EMPTY. Resetting to EMPTY can be
done using a simple write because the waiting thread is the only one that can
change the state from BUSY to EMPTY (Line 20). If no other thread shows up,
the waiting thread needs to reset the state of the slot to EMPTY. This change
requires a compareAndSet() because other threads might be attempting to
exchange by setting the state from WAITING to BUSY (Line 24). If the call is suc-
cessful, it raises a timeout exception. If, however, the call fails, some exchang-
ing thread must have shown up, so the waiting thread completes the exchange
(Line 26).
If the state is WAITING, then some thread is waiting and the slot contains its
item. The thread takes the item, and tries to replace it with its own by changing
the state from WAITING to BUSY using a compareAndSet() (Line 34). It may
fail if another thread succeeds, or the other thread resets the state to EMPTY
following a timeout. If so, the thread must retry. If it does succeed changing
the state to BUSY, then it can return the item.
If the state is BUSY then two other threads are currently using the slot for an
exchange and the thread must retry (Line 37).
Notice that the algorithm allows the inserted item to be null, something
used later in the elimination array construction. There is no ABA problem
because the compareAndSet() call that changes the state never inspects the item.
A successful exchange’s linearization point occurs when the second thread to
arrive changes the state from WAITING to BUSY (Line 34). At this point both
exchange() calls overlap, and the exchange is committed to being successful. An
unsuccessful exchange’s linearization point occurs when the timeout exception is
thrown.
The algorithm is lock-free because overlapping exchange() calls with suffi-
cient time to exchange will fail only if other exchanges are repeatedly succeeding.
Clearly, too short an exchange time can cause a thread never to succeed, so care
must be taken when choosing timeout durations.
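The rendezvous semantics just described closely match those of java.util.concurrent.Exchanger, which (as the chapter notes point out) itself uses elimination internally. A small demonstration of two threads meeting and swapping values, sketched with the JDK class rather than the book's LockFreeExchanger:

```java
import java.util.concurrent.Exchanger;

class ExchangeDemo {
    // Runs two threads that meet at one exchanger; returns what the first
    // thread received, which should be the second thread's item.
    static String swap(String mine, String theirs) throws InterruptedException {
        Exchanger<String> exchanger = new Exchanger<>();
        final String[] received = new String[1];
        Thread a = new Thread(() -> {
            try { received[0] = exchanger.exchange(mine); }
            catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });
        Thread b = new Thread(() -> {
            try { exchanger.exchange(theirs); }
            catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });
        a.start(); b.start();
        a.join(); b.join();
        return received[0];
    }
}
```

The JDK class also offers a timed exchange(x, timeout, unit) overload with the same timeout behavior as the book's version.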
11.4.2 The Elimination Array
An EliminationArray is implemented as an array of Exchanger objects of
maximal size capacity. A thread attempting to perform an exchange picks an
array entry at random, and calls that entry’s exchange() method, providing
1 public class EliminationArray<T> {
2 private static final int duration = ...;
3 LockFreeExchanger<T>[] exchanger;
4 Random random;
5 public EliminationArray(int capacity) {
6 exchanger = (LockFreeExchanger<T>[]) new LockFreeExchanger[capacity];
7 for (int i = 0; i < capacity; i++) {
8 exchanger[i] = new LockFreeExchanger<T>();
9 }
10 random = new Random();
11 }
12 public T visit(T value, int range) throws TimeoutException {
13 int slot = random.nextInt(range);
14 return (exchanger[slot].exchange(value, duration,
15 TimeUnit.MILLISECONDS));
16 }
17 }
Figure 11.7 The EliminationArray<T> class: in each visit, a thread can dynamically choose
the subrange of the array from which it will randomly select a slot.
its own input as an exchange value with another thread. The code for the
EliminationArray appears in Fig. 11.7. The constructor takes as an argu-
ment the capacity of the array (the number of distinct exchangers). The
EliminationArray class provides a single method, visit(), which takes time-
out arguments. (Following the conventions used in the java.util.concurrent pack-
age, a timeout is expressed as a number and a time unit.) The visit() call takes
a value of type T and either returns the value input by its exchange partner,
or throws an exception if the timeout expires without exchanging a value with
another thread. At any point in time, each thread will select a random location in
a subrange of the array (Line 13). This subrange will be determined dynamically
based on the load on the data structure, and will be passed as a parameter to the
visit() method.
The EliminationBackoffStack is a subclass of LockFreeStack that over-
rides the push() and pop() methods, and adds an EliminationArray field.
Figs. 11.8 and 11.9 show the new push() and pop() methods. Upon failure of
a tryPush() or tryPop() attempt, instead of simply backing off, these methods
try to use the EliminationArray to exchange values (Lines 15 and 34). A push()
call calls visit() with its input value as argument, and a pop() call with null as
argument. Both push() and pop() have a thread-local RangePolicy object that
determines the EliminationArray subrange to be used.
When push() calls visit(), it selects a random array entry within its range
and attempts to exchange a value with another thread. If the exchange is suc-
cessful, the pushing thread checks whether the value was exchanged with a pop()
method (Line 18) by testing if the value exchanged was null. (Recall that pop()
always offers null to the exchanger while push() always offers a non-null value.)
Symmetrically, when pop() calls visit(), it attempts an exchange, and if the
1 public class EliminationBackoffStack<T> extends LockFreeStack<T> {
2 static final int capacity = ...;
3 EliminationArray<T> eliminationArray = new EliminationArray<T>(capacity);
4 static ThreadLocal<RangePolicy> policy = new ThreadLocal<RangePolicy>() {
5 protected synchronized RangePolicy initialValue() {
6 return new RangePolicy();
7 }
8 };
9 public void push(T value) {
10 RangePolicy rangePolicy = policy.get();
11 Node node = new Node(value);
12 while (true) {
13 if (tryPush(node)) {
14 return;
15 } else try {
16 T otherValue = eliminationArray.visit
17 (value, rangePolicy.getRange());
18 if (otherValue == null) {
19 rangePolicy.recordEliminationSuccess();
20 return; // exchanged with pop
21 }
22 } catch (TimeoutException ex) {
23 rangePolicy.recordEliminationTimeout();
24 }
25 }
26 }
27 }
Figure 11.8 The EliminationBackoffStack<T> class: this push() method overrides the
LockFreeStack push() method. Instead of using a simple Backoff class, it uses an
EliminationArray and a dynamic RangePolicy to select the subrange of the array within
which to eliminate.
28 public T pop() throws EmptyException {
29 RangePolicy rangePolicy = policy.get();
30 while (true) {
31 Node returnNode = tryPop();
32 if (returnNode != null) {
33 return returnNode.value;
34 } else try {
35 T otherValue = eliminationArray.visit(null, rangePolicy.getRange());
36 if (otherValue != null) {
37 rangePolicy.recordEliminationSuccess();
38 return otherValue;
39 }
40 } catch (TimeoutException ex) {
41 rangePolicy.recordEliminationTimeout();
42 }
43 }
44 }
Figure 11.9 The EliminationBackoffStack<T> class: this pop() method overrides the
LockFreeStack pop() method.
exchange is successful it checks (Line 36) whether the value was exchanged with
a push() call by checking whether it is not null.
It is possible that the exchange will be unsuccessful, either because no
exchange took place (the call to visit() timed out) or because the exchange
was with the same type of operation (such as a pop() with a pop()). For brevity,
we choose a simple approach to deal with such cases: we retry the tryPush() or
tryPop() calls (Lines 13 and 31).
One important parameter is the range of the EliminationArray from which
a thread selects an Exchanger location. A smaller range will allow a greater
chance of a successful collision when there are few threads, while a larger range
will lower the chances of threads waiting on a busy Exchanger (recall that an
Exchanger can only handle one exchange at a time). Thus, if few threads access
the array, they should choose smaller ranges, and as the number of threads
increase, so should the range. One can control the range dynamically using a
RangePolicy object that records both successful exchanges (as in Line 37) and
timeout failures (Line 40). We ignore exchanges that fail because the operations
do not match (such as push() with push()), because they account for a fixed frac-
tion of the exchanges for any given distribution of push() and pop() calls. One
simple policy is to shrink the range as the number of failures increases and vice
versa.
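The chapter does not show RangePolicy's code. A plausible minimal version, consistent with the calls made in Figs. 11.8 and 11.9 and with the shrink-on-timeout policy just described, might look as follows (the maxRange constructor argument and the one-step adjustments are our assumptions, not the book's):

```java
// Hypothetical RangePolicy sketch: grow the subrange on successful
// eliminations, shrink it on timeouts, keeping it within [1, maxRange].
class RangePolicy {
    private final int maxRange;
    private int currentRange = 1;

    RangePolicy(int maxRange) { this.maxRange = maxRange; }

    public void recordEliminationSuccess() {
        if (currentRange < maxRange) currentRange++; // more traffic: widen
    }
    public void recordEliminationTimeout() {
        if (currentRange > 1) currentRange--; // too sparse: narrow
    }
    public int getRange() { return currentRange; }
}
```

Any policy with these three methods plugs into the push() and pop() code unchanged; the policy above merely makes the range track the observed collision rate.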
There are many other possible policies. For example, one can devise a more
elaborate range selection policy, vary the delays on the exchangers dynamically,
add additional backoff delays before accessing the shared stack, and control
whether to access the shared stack or the array dynamically. We leave these as
exercises.
The EliminationBackoffStack is a linearizable stack: any successful push()
or pop() call that completes by accessing the LockFreeStack can be linearized at
the point of its LockFreeStack access. Any pair of eliminated push() and pop()
calls can be linearized when they collide. As noted earlier, the method calls com-
pleted through elimination do not affect the linearizability of those completed
in the LockFreeStack, because they could have taken effect in any state of the
LockFreeStack, and having taken effect, the state of the LockFreeStack would
not have changed.
Because the EliminationArray is effectively used as a backoff scheme, we
expect it to deliver performance comparable to the LockFreeStack at low loads.
Unlike the LockFreeStack, it has the potential to scale. As the load increases, the
number of successful eliminations will grow, allowing many operations to com-
plete in parallel. Moreover, contention at the LockFreeStack is reduced because
eliminated operations never access the stack.
11.5 Chapter Notes
The LockFreeStack is credited to Treiber [145]. Actually, it predates Treiber’s
1986 report; it was probably invented in the early 1970s to motivate the
CAS operation on the IBM 370. The EliminationBackoffStack is due to
Danny Hendler, Nir Shavit, and Lena Yerushalmi [57]. An efficient exchanger,
which quite interestingly uses an elimination array, was introduced by Doug Lea,
Michael Scott, and Bill Scherer [136]. A variant of this exchanger appears in the
Java Concurrency Package. The EliminationBackoffStack we present here is
modular, making use of exchangers, but somewhat inefficient. Mark Moir, Daniel
Nussbaum, Ori Shalev, and Nir Shavit present a highly effective implementation
of an EliminationArray [120].
11.6 Exercises
Exercise 126. Design an unbounded lock-based Stack<T> implementation based
on a linked list.
Exercise 127. Design a bounded lock-based Stack<T> using an array.
1. Use a single lock and a bounded array.
2. Try to make your algorithm lock-free. Where do you run into difficulty?
Exercise 128. Modify the unbounded lock-free stack of Section 11.2 to work
in the absence of a garbage collector. Create a thread-local pool of preallo-
cated nodes and recycle them. To avoid the ABA problem, consider using the
AtomicStampedReference<T> class from java.util.concurrent.atomic that encap-
sulates both a reference and an integer stamp.
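The stamp mechanism this exercise points to can be seen in isolation: every successful update advances the stamp, so even after the reference cycles back to an old value (the ABA pattern), a compareAndSet() holding the old stamp fails. A small illustration of that behavior, not a full solution to the exercise:

```java
import java.util.concurrent.atomic.AtomicStampedReference;

class StampDemo {
    // Simulates an ABA pattern A -> B -> A. The reference is back to "A",
    // but a CAS that still holds the original stamp fails anyway.
    static boolean staleCasFails() {
        AtomicStampedReference<String> top =
            new AtomicStampedReference<>("A", 0);
        int oldStamp = top.getStamp(); // 0

        // Two updates by "other threads": A -> B -> A, bumping the stamp.
        top.compareAndSet("A", "B", 0, 1);
        top.compareAndSet("B", "A", 1, 2);

        // Reference is "A" again, but the stale stamp exposes the change.
        boolean stale = top.compareAndSet("A", "C", oldStamp, oldStamp + 1);
        return !stale && top.getReference().equals("A") && top.getStamp() == 2;
    }
}
```

In the recycled-node stack, the stamp plays exactly this role: a popped-and-reused node carries a new stamp, so a thread holding a stale top cannot mistakenly succeed.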
Exercise 129. Discuss the backoff policies used in our implementation. Does it
make sense to use the same shared Backoff object for both pushes and pops in
our LockFreeStack<T> object? How else could we structure the backoff in space
and time in the EliminationBackoffStack<T>?
Exercise 130. Implement a stack algorithm assuming there is a bound, in any state
of the execution, on the total difference between the number of pushes and pops
to the stack.
Exercise 131. Consider the problem of implementing a bounded stack using an
array indexed by a top counter, initially zero. In the absence of concurrency, these
methods are almost trivial. To push an item, increment top to reserve an array
entry, and then store the item at that index. To pop an item, decrement top, and
return the item at the previous top index.
Clearly, this strategy does not work for concurrent implementations, because
one cannot make atomic changes to multiple memory locations. A single
synchronization operation can either increment or decrement the top counter,
but not both, and there is no way atomically to increment the counter and store
a value.
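The sequential version the exercise describes really is almost trivial; a sketch (deliberately single-threaded, which is exactly the point, since the counter update and the array access are two separate steps):

```java
// Sequential bounded stack: correct only without concurrency, because
// reserving the slot (counter update) and filling it (array access) are
// two distinct, non-atomic steps.
class SeqBoundedStack<T> {
    private final Object[] items;
    private int top = 0;

    SeqBoundedStack(int capacity) { items = new Object[capacity]; }

    public void push(T x) {
        if (top == items.length) throw new IllegalStateException("full");
        items[top++] = x; // increment top to reserve a slot, then store
    }

    @SuppressWarnings("unchecked")
    public T pop() {
        if (top == 0) throw new IllegalStateException("empty");
        return (T) items[--top]; // decrement top, then read the old top index
    }
}
```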
Nevertheless, Bob D. Hacker decides to solve this problem. He decides to adapt
the dual-data structure approach of Chapter 10 to implement a dual stack. His
DualStack<T> class splits push() and pop() methods into reservation and ful-
fillment steps. Bob’s implementation appears in Fig. 11.10.
The stack’s top is indexed by the top field, an AtomicInteger manipu-
lated only by getAndIncrement() and getAndDecrement() calls. Bob’s push()
1 public class DualStack<T> {
2 private class Slot {
3 boolean full = false;
4 volatile T value = null;
5 }
6 Slot[] stack;
7 int capacity;
8 private AtomicInteger top = new AtomicInteger(0); // array index
9 public DualStack(int myCapacity) {
10 capacity = myCapacity;
11 stack = new Slot[capacity];
12 for (int i = 0; i < capacity; i++) {
13 stack[i] = new Slot();
14 }
15 }
16 public void push(T myItem) throws FullException {
17 while (true) {
18 int i = top.getAndIncrement();
19 if (i > capacity - 1) { // is stack full?
20 throw new FullException();
21 } else if (i >= 0) {
22 // index in range: slot i reserved
23 stack[i].value = myItem;
24 stack[i].full = true; //push fulfilled
25 return;
26 }
27 }
28 }
29 public T pop() throws EmptyException {
30 while (true) {
31 int i = top.getAndDecrement();
32 if (i < 0) { // is stack empty?
33 throw new EmptyException();
34 } else if (i < capacity - 1) {
35 while (!stack[i].full){};
36 T value = stack[i].value;
37 stack[i].full = false;
38 return value; //pop fulfilled
39 }
40 }
41 }
42 }
Figure 11.10 Bob’s problematic dual stack.
method’s reservation step reserves a slot by applying getAndIncrement() to top.
Suppose the call returns index i. If i is in the range 0...capacity − 1, the reser-
vation is complete. In the fulfillment phase, push(x) stores x at index i in the
array, and raises the full flag to indicate that the value is ready to be read. The
value field must be volatile to guarantee that once the flag is raised, the value
has already been written to index i of the array.
If the index returned from push()’s getAndIncrement() is less than 0,
the push() method repeatedly retries getAndIncrement() until it returns an
index greater than or equal to 0. The index could be less than 0 due to
getAndDecrement() calls of failed pop() calls to an empty stack. Each such failed
getAndDecrement() decrements the top by one more past the 0 array bound.
If the index returned is greater than capacity −1, push() throws an exception
because the stack is full.
The situation is symmetric for pop(). It checks that the index is within the
bounds and removes an item by applying getAndDecrement() to top, returning
index i. If i is in the range 0...capacity−1, the reservation is complete. For the
fulfillment phase, pop() spins on the full flag of array slot i, until it detects that
the flag is true, indicating that the push() call is successful.
What is wrong with Bob’s algorithm? Is this an inherent problem or can you
think of a way to fix it?
Exercise 132. In Exercise 97 we ask you to implement the Rooms interface, repro-
duced in Fig. 11.11. The Rooms class manages a collection of rooms, indexed
from 0 to m (where m is a known constant). Threads can enter or exit any
room in that range. Each room can hold an arbitrary number of threads simul-
taneously, but only one room can be occupied at a time. The last thread to
leave a room triggers an onEmpty() handler, which runs while all rooms are
empty.
Fig. 11.12 shows an incorrect concurrent stack implementation.
1. Explain why this stack implementation does not work.
2. Fix it by adding calls to a two-room Rooms class: one room for pushing and
one for popping.
1 public interface Rooms {
2 public interface Handler {
3 void onEmpty();
4 }
5 void enter(int i);
6 boolean exit();
7 public void setExitHandler(int i, Rooms.Handler h);
8 }
Figure 11.11 The Rooms interface.
1 public class Stack<T> {
2 private AtomicInteger top;
3 private T[] items;
4 public Stack(int capacity) {
5 top = new AtomicInteger();
6 items = (T[]) new Object[capacity];
7 }
8 public void push(T x) throws FullException {
9 int i = top.getAndIncrement();
10 if (i >= items.length) { // stack is full
11 top.getAndDecrement(); // restore state
12 throw new FullException();
13 }
14 items[i] = x;
15 }
16 public T pop() throws EmptyException {
17 int i = top.getAndDecrement() - 1;
18 if (i < 0) { // stack is empty
19 top.getAndIncrement(); // restore state
20 throw new EmptyException();
21 }
22 return items[i];
23 }
24 }
Figure 11.12 Unsynchronized concurrent stack.
Exercise 133. This exercise is a follow-on to Exercise 132. Instead of having the
push() method throw FullException, exploit the push room’s exit handler to
resize the array. Remember that no thread can be in any room when an exit han-
dler is running, so (of course) only one exit handler can run at a time.