Even if compilers don't reorder them, the underlying hardware might do it (or might make it seem to other cores as if it had), because that can sometimes make the code run faster.
However, the use of std::atomics imposes restrictions on how code can be reordered, and one such restriction is that no code that, in the source code, precedes a write of a std::atomic variable may take place (or appear to other cores to take place) afterwards.[20]

[20] This is true only for std::atomics using sequential consistency, which is both the default and the only consistency model for std::atomic objects that use the syntax shown in this book. C++11 also supports consistency models with more flexible code-reordering rules. Such weak (aka relaxed) models make it possible to create software that runs faster on some hardware architectures, but the use of such models yields software that is much more difficult to get right, to understand, and to maintain. Subtle errors in code using relaxed atomics are not uncommon, even for experts, so you should stick to sequential consistency if at all possible.
That means that in our code,

auto imptValue = computeImportantValue();  // compute value
valAvailable = true;                       // tell other task
                                           // it's available

not only must compilers retain the order of the assignments to imptValue and valAvailable, they must generate code that ensures that the underlying hardware does, too. As a result, declaring valAvailable as std::atomic ensures that our critical ordering requirement — imptValue must be seen by all threads to change no later than valAvailable does — is maintained.
Declaring valAvailable as volatile doesn't impose the same code reordering restrictions:

volatile bool valAvailable(false);
auto imptValue = computeImportantValue();
valAvailable = true; // other threads might see this assignment
// before the one to imptValue!
Here, compilers might flip the order of the assignments to imptValue and valAvailable, and even if they don't, they might fail to generate machine code that would prevent the underlying hardware from making it possible for code on other cores to see valAvailable change before imptValue.
These two issues — no guarantee of operation atomicity and insufficient restrictions on code reordering — explain why volatile's not useful for concurrent programming, but they don't explain what it is useful for. In a nutshell, it's for telling compilers that they're dealing with memory that doesn't behave normally.
“Normal” memory has the characteristic that if you write a value to a memory location, the value remains there until something overwrites it. So if I have a normal int,
int x;
and a compiler sees the following sequence of operations on it,

auto y = x; // read x
y = x;      // read x again

the compiler can optimize the generated code by eliminating the assignment to y, because it's redundant with y's initialization.
Normal memory also has the characteristic that if you write a value to a memory location, never read it, and then write to that memory location again, the first write can be eliminated, because it was never used. So given these two adjacent statements,
x = 10; // write x
x = 20; // write x again
compilers can eliminate the first one. That means that if we have this in the source code,
auto y = x; // read x
y = x; // read x again
x = 10; // write x
x = 20; // write x again
compilers can treat it as if it had been written like this:
auto y = x; // read x
x = 20; // write x
Lest you wonder who'd write code that performs these kinds of redundant reads and superfluous writes (technically known as redundant loads and dead stores ), the answer is that humans don't write it directly — at least we hope they don't. However, after compilers take reasonable-looking source code and perform template instantiation, inlining, and various common kinds of reordering optimizations, it's not uncommon for the result to have redundant loads and dead stores that compilers can get rid of.
Such optimizations are valid only if memory behaves normally. “Special” memory doesn't. Probably the most common kind of special memory is memory used for memory-mapped I/O. Locations in such memory actually communicate with peripherals, e.g., external sensors or displays, printers, network ports, etc., rather than reading or writing normal memory (i.e., RAM). In such a context, consider again the code with seemingly redundant reads:
auto y = x; // read x
y = x; // read x again
If x corresponds to, say, the value reported by a temperature sensor, the second read of x is not redundant, because the temperature may have changed between the first and second reads.
It's a similar situation for seemingly superfluous writes. In this code, for example,
x = 10; // write x
x = 20; // write x again
if x corresponds to the control port for a radio transmitter, it could be that the code is issuing commands to the radio, and the value 10 corresponds to a different command from the value 20. Optimizing out the first assignment would change the sequence of commands sent to the radio.
volatile is the way we tell compilers that we're dealing with special memory. Its meaning to compilers is “Don't perform any optimizations on operations on this memory.” So if x corresponds to special memory, it'd be declared volatile:

volatile int x;
Consider the effect that has on our original code sequence:
auto y = x; // read x
y = x; // read x again ( can't be optimized away )
x = 10; // write x ( can't be optimized away )
x = 20; // write x again
This is precisely what we want if x is memory-mapped (or has been mapped to a memory location shared across processes, etc.).
Pop quiz! In that last piece of code, what is y's type: int or volatile int?[21]

[21] y's type is auto-deduced, so it uses the rules described in Item 2. Those rules dictate that for the declaration of non-reference non-pointer types (which is the case for y), const and volatile qualifiers are dropped. y's type is therefore simply int. This means that redundant reads of and writes to y can be eliminated. In the example, compilers must perform both the initialization of and the assignment to y, because x is volatile, so the second read of x might yield a different value from the first one.
The fact that seemingly redundant loads and dead stores must be preserved when dealing with special memory explains, by the way, why std::atomics are unsuitable for this kind of work. Compilers are permitted to eliminate such redundant operations on std::atomics. The code isn't written quite the same way it is for volatiles, but if we overlook that for a moment and focus on what compilers are permitted to do, we can say that, conceptually, compilers may take this,

std::atomic<int> x;
auto y = x; // conceptually read x (see below)