A design approach to concurrent containers, proposed in 2009 and recently turned into a C++ standard proposal: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0652r0.html
The price of similarity, or whether to say 'NO' to STL
1. Intel Dynamic Execution Environments Symposium
Collaborate Innovate Strategize Lead
Session III: Productization Issues
The price of similarity,
or whether to say ‘NO’ to STL
Anton Malakhov
Alexey Kukanov
Arch Robison
3. Why STL similarity?
IDEES’09
STL is C++ standard, authority, and a role model
Many libraries try STL similarity for parallelism:
TBB, PPL, STAPL, PSTL, Range Partition Adapters, Thrust, Glift
But only TBB and PPL try to closely follow STL
requirements for concurrent containers
Now, we can sum up this experience
4. STL is incompatible with concurrency:
lack of atomicity
C++ standard prescribes interfaces for efficient
serial manipulation of containers
item = queue.front(); // Get the first item
queue.pop(); // Remove it from the queue
front() returns a reference to avoid overhead of return-by-value
Interface precludes safe concurrent manipulation
Two threads cannot grab items from the queue concurrently
front() and pop() operations might race in unfortunate ways
The race is induced by the interface and cannot be fixed by any
implementation technique from inside
The interface is too orthogonal to maintain
the level of atomicity required for correctness
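As an illustration of the point above, here is a minimal sketch of a merged operation that avoids the front()/pop() race. It assumes a plain mutex-protected wrapper around std::queue; the class name LockedQueue and the method try_pop are hypothetical names chosen for this example, not the TBB or PPL implementation.

```cpp
#include <mutex>
#include <queue>
#include <utility>

// Hypothetical wrapper, for illustration only.
template <typename T>
class LockedQueue {
    std::queue<T> items_;
    std::mutex mutex_;

public:
    void push(T value) {
        std::lock_guard<std::mutex> lock(mutex_);
        items_.push(std::move(value));
    }

    // Checks for emptiness, takes the first item, and removes it
    // under a single lock, so the three steps cannot interleave
    // with another thread's try_pop.
    bool try_pop(T& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (items_.empty()) {
            return false;
        }
        out = std::move(items_.front());
        items_.pop();
        return true;
    }
};
```

The price is the one named on the slide: the item is returned by value rather than by reference, which is exactly the overhead front() was designed to avoid.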
5. STL is incompatible with concurrency:
Control of lifetime and access
Destruction and deallocation of an element is only
safe if all other threads are done accessing it
Supporting concurrent erase() and following STL
semantics is not feasible for concurrent containers
How we
ppl::concurrent_unordered_map drops concurrent erase()
tbb::concurrent_hash_map introduces accessor
• a smart pointer to prevent premature destruction
STL lacks support for synchronizing
object access and lifetime
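A rough sketch of the lifetime half of the problem, assuming elements are held through std::shared_ptr so that erase() only unlinks an element and destruction waits for the last handle. The class SharedValueMap is hypothetical and deliberately simpler than tbb::concurrent_hash_map: its handle extends lifetime but does not synchronize access to the value, which the TBB accessor also does.

```cpp
#include <map>
#include <memory>
#include <mutex>
#include <string>

// Hypothetical map, for illustration only.
class SharedValueMap {
    std::map<std::string, std::shared_ptr<int>> items_;
    std::mutex mutex_;

public:
    void insert(const std::string& key, int value) {
        std::lock_guard<std::mutex> lock(mutex_);
        items_[key] = std::make_shared<int>(value);
    }

    // Returns a counted handle; an empty handle means "not found".
    std::shared_ptr<int> find(const std::string& key) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = items_.find(key);
        if (it == items_.end()) {
            return std::shared_ptr<int>();
        }
        return it->second;
    }

    // Unlinks the element from the map. The value itself is destroyed
    // only when the last outstanding handle is released.
    bool erase(const std::string& key) {
        std::lock_guard<std::mutex> lock(mutex_);
        return items_.erase(key) > 0;
    }
};
```

Note that find() here returns a handle rather than an iterator or a reference, so the interface already departs from STL semantics.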
6. Strict STL similarity is inefficient:
Decomposition vs. Merge
Decomposition of entities is a key STL principle
As expected, C++0x follows it
class shared_ptr { atomic ops }; class mutex { atomic ops };
function F { ptr = shared_ptr; ptr->lock(); } // 2 atomic ops
Merge of entities can be optimized for efficiency
class shared_lockable_ptr { atomic ops };
function F { ptr = shared_lockable_ptr.lock(); } // 1 atomic op
The interface is too orthogonal to maintain
the level of atomicity required for efficiency
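One way the merged entity could work is to pack the reference count and the lock flag into a single atomic word, so that taking a reference and locking is one read-modify-write in the uncontended case. The sketch below is an assumption made for illustration: the class RefLockWord and its bit layout are hypothetical and are not the shared_lockable_ptr from the slides.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical control word: bit 0 is the lock flag,
// the higher bits hold the reference count.
class RefLockWord {
    std::atomic<std::uint32_t> state_{0};

public:
    // Adds one reference (+2) and sets the lock bit (+1) with a single
    // compare-and-swap. Returns false if the lock is already held.
    bool try_acquire_locked() {
        std::uint32_t seen = state_.load(std::memory_order_relaxed);
        while ((seen & 1u) == 0) {
            if (state_.compare_exchange_weak(seen, seen + 3u,
                                             std::memory_order_acquire,
                                             std::memory_order_relaxed)) {
                return true;
            }
            // On failure 'seen' holds the fresh value; the loop re-checks it.
        }
        return false;
    }

    // Clears the lock bit and drops the reference with one fetch-and-subtract.
    void release_unlock() {
        state_.fetch_sub(3u, std::memory_order_release);
    }

    std::uint32_t references() const {
        return state_.load(std::memory_order_relaxed) >> 1;
    }

    bool locked() const {
        return (state_.load(std::memory_order_relaxed) & 1u) != 0;
    }
};
```

With separate shared_ptr and mutex objects the same step needs one atomic operation on each, which is the "2 atomic ops" case above.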
7. STL is not efficient for concurrency:
Iterators vs. Ranges
4 of 5 iterator kinds are inherently serial
Abstraction of pointer bumping
Enabling concurrency-safe iteration is tricky
Concurrent modifications can invalidate iterators
STL ranges are only defined implicitly
Abstraction of a linear sequence: [first, last)
Carried dependency precludes efficient parallel iteration
• Only random-access iterators allow it
A generic, recursively divisible range would be better
E.g. enable scalable parallelism in breadth-first tree traversal
STL abstracted a serial idiom
unsuitable for parallel programming
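A minimal sketch of what a recursively divisible range might look like, assuming a half-open index range with a grain size. The names IndexRange, is_divisible, split, and sum_range are hypothetical. The recursion below runs serially; the point is that the two halves share no carried dependency, so a scheduler could hand them to different threads.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical range type, for illustration only.
struct IndexRange {
    std::size_t first;
    std::size_t last;   // half-open: [first, last)
    std::size_t grain;  // pieces of this size or smaller are not split

    bool is_divisible() const {
        std::size_t length = last - first;
        return length > grain && length >= 2;
    }

    // Keeps the left half in this object and returns the right half.
    IndexRange split() {
        std::size_t middle = first + (last - first) / 2;
        IndexRange right{middle, last, grain};
        last = middle;
        return right;
    }
};

// Sums data over the range by recursive division. The two recursive
// calls are independent of each other and could run as separate tasks.
long long sum_range(const std::vector<int>& data, IndexRange range) {
    if (range.is_divisible()) {
        IndexRange right = range.split();
        return sum_range(data, range) + sum_range(data, right);
    }
    long long total = 0;
    for (std::size_t i = range.first; i < range.last; ++i) {
        total += data[i];
    }
    return total;
}
```

Unlike an iterator pair, nothing here requires stepping from first to last in order, and the same split-based shape could describe subtrees rather than index intervals.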
8. Concurrent interface is not efficient for serial usage
Objects can be used both in serial and in parallel
Due to data sharing between program stages
It’s also desirable to add parallelism incrementally
Concurrent interface can require code changes
Due to semantic differences discussed earlier
Contradictory to incremental parallelization
Serial performance can suffer
Traded for better concurrency in implementations
Impacted by synchronization overhead (even without contention)
Enabling concurrency imposes additional cost
even to serial stages of a program
9. Concurrency as an STL extension:
The root of the problems
STL similarity bends
concurrent interface
Concurrency is an afterthought
for the serial world of STL?
Concurrency puts
pressure on serial code
Undesired impact of
parallelism adoption
Separate the interfaces
Release from straitjackets!
[Diagram: an STL-like class containing serial methods and concurrent methods, placed between "STL requirements and principles" and "Parallel principles"]
[Timeline over object lifetime: Serial, Parallel execution, Serial]
10. Dual Interfaces
Same data structure,
two interfaces
Concurrent and serial views
STL compatibility
Enables smooth migration
Enabled efficiency
Concurrent interface drops
burden of STL similarity
Serial performance is
available with serial interface
Parallel principles and practices
class ConcurrentFoo {
Concurrent methods
SerialViewOfFoo serial()
};
STL requirements and principles
class SerialViewOfFoo {
Serial methods, up to
100% STL compatible!
};
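The two-class outline above could be fleshed out roughly as follows. This is a sketch under stated assumptions: the names ConcurrentBag and SerialViewOfBag are hypothetical, the container is a mutex-protected std::vector chosen for brevity, and the serial view exposes only a few STL-style members.

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical serial view: STL-style methods, no synchronization.
template <typename T>
class SerialViewOfBag {
    std::vector<T>& items_;

public:
    explicit SerialViewOfBag(std::vector<T>& items) : items_(items) {}

    typename std::vector<T>::iterator begin() { return items_.begin(); }
    typename std::vector<T>::iterator end() { return items_.end(); }
    std::size_t size() const { return items_.size(); }
    void clear() { items_.clear(); }
};

// Hypothetical concurrent container over the same data.
template <typename T>
class ConcurrentBag {
    std::vector<T> items_;
    std::mutex mutex_;

public:
    // Concurrent interface: safe to call from several threads at once.
    void add(const T& value) {
        std::lock_guard<std::mutex> lock(mutex_);
        items_.push_back(value);
    }

    // Serial interface: the caller promises that no concurrent
    // calls are in flight while the view is in use.
    SerialViewOfBag<T> serial() { return SerialViewOfBag<T>(items_); }
};
```

In a serial stage the view is used like any STL container, for example `for (int x : bag.serial()) { ... }`, and pays nothing for locking. Nothing in the types enforces the caller's promise, which is one source of the extra rules users would have to learn.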
11. Dual interfaces: the silver bullet?
Writing “concurrent”, thinking “STL”
Some C++ developers still expect STL behavior or guarantees
• An example:
• Const methods are expected to do no visible modification
• Modifications might be undetectable by concurrent interface
• BUT still visible by serial interface, e.g. as broken iterators
Can indirectly affect concurrent interface and implementation
Additional burden on users
Bigger “surface area” to learn
Special rules to remember & follow
• E.g. switch between interfaces – when and how?
Will mainstream customers buy into the idea?
Solves the problems but raises new questions
12. Summary
STL compatibility is important for customers
STL similarity in concurrent interface has problems
Principles of STL and concurrency contradict
Separate interfaces for serial and concurrent
access to the same object can do better
Still some questions remain open
14. Dual interfaces: composability rules
General rule is:
“Do not mix concurrent and serial interfaces”
Interfaces are not compatible with each other anyway
It enables smooth migration and efficiency
Only one reference using only one type
must be passed to a function
If callers are unknown
• Don’t switch interfaces inside
If subroutine is unknown
• Don’t switch interfaces during its runtime
For mixed interface, composability is worse
It is unknown what interface to use
Or use concurrent by default, but
• The whole access code must be changed
• Serial code loses efficiency
• e.g. in const-serial-access of a huge parallel region
// Thread-safe read/write access
ConcurrentFoo crw;
// Thread-safe read-only access
const SerialFoo cro;
// Serial read/write access
SerialFoo srw;
15. Links
Many libraries try STL similarity for parallelism:
TBB and PPL
STAPL
PSTL
Range Partition Adapters
Thrust: NVIDIA's STL
Glift