Contents
Readings
Blogs
- Anthony Williams
- Bartosz Milewski
- Concurrency Freaks
- Comparison: Lockless programming with atomics in C++ 11 vs. mutex and RW-locks
- Fabian “ryg” Giesen
- Fast Bounded-Concurrency Hash Tables
- Jeff Preshing
- John Regehr
- John Wickerson
- Kukuryku Hub Series
- "Moody Camel" Series
- PSA: you should use WTF::Lock and WTF::Condition instead of WTF::SpinLock, WTF::Mutex, WTF::ThreadCondition, std::mutex, std::condition_variable, or std::condition_variable_any
- Raymond Chen
- The difficulty of lock-free programming: a bug in lockfree stack
- The x86 Memory Model
- Trip Report: Ad-Hoc Meeting on Threads in C++
Dissertations
- Compiler optimisations and relaxed memory consistency models
- Correct Compilation of Relaxed Memory Concurrency
- Designing Memory Consistency Models For Shared-Memory Multiprocessors
- Memory Consistency Models for Shared-Memory Multiprocessors
- The C11 and C++11 Concurrency Model
  - 2014 PhD dissertation; Mark Batty
  - https://www.cs.kent.ac.uk/people/staff/mjb211/docs/toc.pdf
  - 2015 SIGPLAN John C. Reynolds Doctoral Dissertation award citation: "Mark Batty’s dissertation makes significant contributions to the understanding of memory models for C and C++. The ISO C++ committee proposed a design for C and C++ concurrency that was not up to the task of capturing a realistic relaxed-memory concurrency model. Batty’s work uncovered a number of subtle and serious flaws in the design, and produced an improved design in completely rigorous and machine-checked mathematics. Using software tools to explore the consequences of the design, derived directly from the mathematics, it showed that it has the desired behavior on many examples, and developed mechanized proofs that the design meets some of the original goals, showing that for programs in various subsets of the language one can reason in simpler models. The standards committee have adopted this work in their C11, C++11, and C++14 standards. The members of the award committee were impressed with the quality of the work, the impact it has had on the standardization process for C++, and the clarity of the presentation."
- The Semantics of Multicopy Atomic ARMv8 and RISC-V
Papers - Data Structures
- Practical lock freedom
- Practical lock-free data structures
- Systems programming: Coping with parallelism
- References: Synchrobench 30+ data structures papers
- References: CDS C++ library data structures
Papers - Implementation
- Common Compiler Optimisations are Invalid in the C11 Memory Model and what we can do about it
- N4455: No Sane Compiler Would Optimize Atomics
- Partially Redundant Fence Elimination for x86, ARM, and Power Processors
Papers - Memory Model
- A Promising Semantics for Relaxed-Memory Concurrency
- AutoMO: Automatic Inference of Memory Order Parameters for C/C++11
- Bridging the Gap Between Programming Languages and Hardware Weak Memory Models
- C++ Memory Model
- Concurrency memory model compiler consequences
- Foundations of the C++ Concurrency Memory Model
- Memory Barriers: a Hardware View for Software Hackers
- Memory Model = Instruction Reordering + Store Atomicity
- Memory Models for C/C++ Programmers
- Memory Models: A Case for Rethinking Parallel Languages and Hardware
- On Library Correctness under Weak Memory Consistency
- Programming Languages & Verification – MPI SWS
- Shared Memory Consistency Models: A Tutorial
- Simple and Efficient Semantics for Concurrent Programming Languages
- Synthesizing Memory Models from Framework Sketches and Litmus Tests
  - PLDI 2017
  - James Bornholt and Emina Torlak
  - MemSynth: Synthesis-Aided Memory Model Development
- The Silently Shifting Semicolon
  - SNAPL 2015
  - Daniel Marino, Todd D. Millstein, Madanlal Musuvathi, Satish Narayanasamy, Abhayendra Singh
  - https://www.microsoft.com/en-us/research/publication/the-silently-shifting-semicolon/
  - http://web.cs.ucla.edu/~todd/research/snapl15.pdf
  - Proposes the designation "Weak DRF0 (WDRF0)" for the C++ memory model: "C/C++ has settled for a memory model weaker than DRF0, which we call Weak DRF0 (WDRF0). DRF programs are not guaranteed SC semantics in WDRF0. To get SC, programmers have to additionally avoid the use of the so-called low-level atomic primitives. The weak semantics of DRF programs in C++ is similar in complexity to the semantics of non-DRF programs in Java."
- Threads Basics
- Threads Cannot be Implemented as a Library
- Verifying C11 Programs Operationally
- x86-TSO: A Rigorous and Usable Programmer’s Model for x86 Multiprocessors
- You Don't Know Jack about Shared Variables or Memory Models: Data Races are Evil
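
The WDRF0 remark quoted under "The Silently Shifting Semicolon" above can be made concrete with the classic store-buffering litmus test. The following is a minimal sketch and is not taken from any of the listed papers; the helper names `store_buffering_once` and `count_both_zero` are hypothetical. Both variants are data-race-free, yet only the `std::memory_order_seq_cst` variant is guaranteed to rule out the outcome in which both threads read 0.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <utility>

// Hypothetical helper: runs one round of the store-buffering litmus test
// with the given memory order and returns the value each thread read.
std::pair<int, int> store_buffering_once(std::memory_order order) {
    std::atomic<int> x{0};
    std::atomic<int> y{0};
    int r1 = -1;
    int r2 = -1;

    // Thread 1 writes x, then reads y.
    std::thread t1([&] {
        x.store(1, order);
        r1 = y.load(order);
    });
    // Thread 2 writes y, then reads x.
    std::thread t2([&] {
        y.store(1, order);
        r2 = x.load(order);
    });

    // join() synchronizes with thread completion, so reading r1 and r2
    // afterwards is not a data race.
    t1.join();
    t2.join();
    return {r1, r2};
}

// Hypothetical helper: counts the rounds in which both threads read 0,
// an outcome that no sequentially consistent interleaving can produce.
int count_both_zero(std::memory_order order, int rounds) {
    assert(rounds >= 0);
    int hits = 0;
    for (int i = 0; i < rounds; ++i) {
        const std::pair<int, int> r = store_buffering_once(order);
        if (r.first == 0 && r.second == 0) {
            ++hits;
        }
    }
    return hits;
}
```

With `std::memory_order_seq_cst`, `count_both_zero` must return 0. With `std::memory_order_relaxed`, a nonzero count is permitted by the language, though whether it is ever observed depends on the hardware, the compiler, and timing.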
References
- 1024cores
- C++11 Language Extensions — Concurrency
- C++11 Standard Library Extensions — Concurrency
- C/C++11 mappings to processors
- GCC Wiki - Atomic: https://gcc.gnu.org/wiki/Atomic/
- GCC Wiki - The C++11 Memory Model and GCC: https://gcc.gnu.org/wiki/Atomic/GCCMM
- glibc wiki: Concurrency
- Linux kernel memory barriers
- LLVM Atomic Instructions and Concurrency Guide
- Memory model - http://en.cppreference.com/w/cpp/language/memory_model
- N1276: A Less Formal Explanation of the Proposed C++ Concurrency Memory Model
- Programming with Threads: Questions Frequently Asked by C and C++ Programmers
- REMS: Rigorous Engineering for Mainstream Systems
- Some notes on lock-free and wait-free algorithms
- The Check Tool Suite: Programmability, Correctness and Security Issues in Heterogeneous Multiprocessor and Mobile Systems
- Threads and memory model for C++
- What every systems programmer should know about lockless concurrency
- Why the "volatile" type class should not be used
Books
- A Primer on Memory Consistency and Cache Coherence, Second Edition
- Is Parallel Programming Hard, And If So, What Can You Do About It?
Courses
- Advanced Computer Architecture - University of Utah - CS/ECE 7810
  - http://www.eng.utah.edu/~cs7810/
  - Lecture 11: Consistency Models: Slides (pdf)
  - YouTube Video 68 (Example multi-threaded programs and sequentially consistent results)
  - YouTube Video 69 (Hardware support for sequential consistency, example of how SC is violated if program order is violated)
  - YouTube Video 70 (Example on how a coherence protocol may violate write atomicity and sequential consistency, hardware support for sequential consistency, safe optimizations to speed up the hardware)
  - YouTube Video 71 (A hardware-software approach to improving performance with relaxed consistency models and fences)
- Parallel Computer Architecture - CMU - 18-742
Software
- act: automagic compiler tormentor
  - act is a toolbox for finding concurrency memory model discrepancies between C code and its compiled assembly.
  - It can use memalloy as a test-case generator, and generates litmus tests that can be used with herd7.
  - https://github.com/MattWindsor91/act/
- CDSChecker: A Model Checker for C11 and C++11 Atomics
- CppMem: Interactive C/C++ memory model
- diy
  - The software suite diy provides tools to design and test weak memory models. It handles ARMv8 (AArch64), ARMv7 (ARM), Power (PPC) and X86 assembly models, plus a generic (LISA) assembly language.
  - http://diy.inria.fr/
- herd, a memory model simulator
- MemSynth: An advanced automated reasoning tool for memory consistency model specifications.
- Relacy Race Detector
- RMEM
- Synchrobench
Software - Data Structures
- ASCYLIB
  - https://github.com/LPD-EPFL/ASCYLIB
  - ASCYLIB is a concurrent-search data-structure library with over 30 implementations of linked lists, hash tables, skip lists, and binary search trees.
  - Asynchronized Concurrency: The Secret to Scaling Concurrent Search Data Structures
- Boost.Lockfree
- CDS C++ library
  - https://github.com/khizmax/libcds
  - The Concurrent Data Structures (CDS) library is a collection of concurrent containers that don't require external (manual) synchronization for shared access, and safe memory reclamation (SMR) algorithms like Hazard Pointer and user-space RCU. CDS is a mostly header-only template library; only the SMR core implementation is separated into a .so/.dll file.
- ConcurrencyFreaks
- Concurrency Kit
  - Concurrency primitives, safe memory reclamation mechanisms and non-blocking (including lock-free) data structures designed to aid in the research, design and implementation of high performance concurrent systems.
  - https://github.com/concurrencykit/ck
- Concurrent data structures
- moodycamel::ConcurrentQueue (MPMC): https://github.com/cameron314/concurrentqueue
- xenium: a collection of concurrent data structures and memory reclamation algorithms (a header-only library)
Talks
Slides
- COMP 522: Multicore Computing - Presentations
- Concurrency Kit talks
  - http://concurrencykit.org/slides.html
  - Lock-Free Algorithms: An Introduction; Introduction to Lock-Free Algorithms: Through a case study; Safe Memory Reclamation: Epoch Reclamation; Towards accessible non-blocking technology for C; Fast Bounded-Concurrency Hash Tables
- Memory Management in C++14 and Beyond
- Modern concurrent code in C/C++
- Updating glibc concurrency
- Sarita Adve's Research Group - Talks
Videos
2019
- Atomics, Locks, and Tasks
- Concurrency in C++20 and Beyond
- The C++20 Synchronization Library
- The One-Decade Task: Putting std::atomic in CUDA
- Wait-free data structures and wait-free transactions
- Weak Memory Concurrency in C/C++11
2018
- A “Post-ISA” Era in Computer Systems: Challenges and Opportunities
- The C++ Execution Model
2017
- C++ atomics, from basic to advanced. What do they really do?
- An Interesting Lock-free Queue - Part 2 of N
- Coherence, Consistency, & Déjà vu: Memory Hierarchies in the Era of Specialization
- Multicore Synchronization: The Lesser-Known Primitives
2016
- The speed of concurrency (is lock-free faster?)
2015
- Atomic Counters or A Lesson on Performance and Hardware Concurrency
- Safety: off. How not to shoot yourself in the foot with C++ atomics
- The Dos and Don'ts of Multithreading
- Live Lock-Free or Deadlock (Practical Lock-free Programming)
- C++11/14/17 Atomics the Deep dive: the gory details, before the story consumes you!
- C++ Atomics: The Sad Story of memory_order_consume: A Happy Ending At Last?
- How to make your data structures wait-free for reads
- C++ in the Audio Industry
- Defining Correctness Conditions for Concurrent Objects in Multicore Architectures
- C Concurrency: Still Tricky
- Memory Access Ordering in Complex Embedded Systems
2014
- Lockless programming
- Lock-Free Programming (or, Juggling Razor Blades)
  - CppCon 2014; Herb Sutter
  - http://herbsutter.com/2014/10/18/my-cppcon-talks-2/
  - Part 1: Lazy initialization with DCL vs. call_once vs. function local statics, and lock-free mailbox algorithms
  - Part 2: Lock-free linked lists, the ABA problem, and atomic smart pointers
- How Ubisoft Develops Games for Multicore - Before and After C++11
- C++ Memory Model Meets High-Update-Rate Data Structures
- Lock-free by Example
- Blowing up the (C++11) atomic barrier - Optimizing C++11 atomics in LLVM
- The C++ Memory Model
- The Dos and Don'ts of Multithreaded Programming
- The C++ memory model
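
Part 1 of "Lock-Free Programming (or, Juggling Razor Blades)" above compares three lazy-initialization strategies. As a rough illustration only, here is a minimal sketch of what each can look like in standard C++11; it is not taken from the talk, and `Widget` and the three `instance_*` functions are hypothetical names invented for this example.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical resource type used only for illustration.
struct Widget {
    int value = 42;
};

// 1. Function-local static: since C++11 the language guarantees that the
//    initialization runs exactly once, even with concurrent callers.
Widget& instance_static() {
    static Widget w;
    return w;
}

// 2. std::call_once: the callable runs exactly once, and every caller
//    returns only after that initialization has completed.
//    (The object is intentionally never freed in this sketch.)
Widget& instance_call_once() {
    static std::once_flag flag;
    static Widget* ptr = nullptr;
    std::call_once(flag, [] { ptr = new Widget(); });
    assert(ptr != nullptr);
    return *ptr;
}

// 3. Double-checked locking (DCL) with an atomic pointer: an acquire load
//    on the fast path, and a mutex plus a release store on the slow path.
//    (The object is intentionally never freed in this sketch.)
Widget& instance_dcl() {
    static std::atomic<Widget*> ptr{nullptr};
    static std::mutex m;

    Widget* p = ptr.load(std::memory_order_acquire);
    if (p == nullptr) {
        std::lock_guard<std::mutex> lock(m);
        // Second check under the lock; the mutex already orders this load.
        p = ptr.load(std::memory_order_relaxed);
        if (p == nullptr) {
            p = new Widget();
            ptr.store(p, std::memory_order_release);
        }
    }
    assert(p != nullptr);
    return *p;
}
```

All three accessors return the same object on every call. The function-local static is the shortest to write; the DCL variant is shown mainly because it is the one that is easy to get wrong without the acquire/release pairing.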
2013
- Shattering Illusions in Lock-Free Worlds: Compiler/Hardware Behaviors OSes/VMs
- Low Level Threading with C++11
- std::atomic explained
- Everything you always wanted to know about synchronization but were afraid to ask
2012
- atomic Weapons: The C++ Memory Model and Modern Hardware
- Don't Try This at Work -- Low Level Threading with C++11
- Threads and Shared Variables in C++11
- C++11 Threads Surprises
2011
- Lockfree Programming Part 2: Data Structures
2010
- The Basics of Lock-free Programming
Tags: native, reading
Last modified 02 October 2024