LLVM loop optimization can make safe programs crash #28728
|
The LLVM IR of the optimised code is ; Function Attrs: noreturn nounwind readnone uwtable
define internal void @_ZN4main20h5ec738167109b800UaaE() unnamed_addr #0 {
entry-block:
unreachable
}This kind of optimisation breaks the main assumption that should normally hold on uninhabited types: it should be impossible to have a value of that type. |
|
triage: I-nominated Seems bad! If LLVM doesn't have a way to say "yes, this loop really is infinite" though then we may just have to sit-and-wait for the upstream discussion to settle. |
|
A way to prevent infinite loops from being optimised away is to add |
|
Is this related to #18785? That one's about infinite recursion to be UB, but it sounds like the fundamental cause might be similar: LLVM doesn't consider not halting to be a side effect, so if a function has no side effects other than not halting, it's happy to optimize it away. |
|
Yes, looks like it's the same. Further down that issue, they show how to get |
|
Crash, or, possibly even worse heartbleed https://play.rust-lang.org/?gist=15a325a795244192bdce&version=stable |
|
So I've been wondering how long until somebody reports this. :) In my opinion, the best solution would of course be if we could tell LLVM not to be so aggressive about potentially infinite loops. Otherwise, the only thing I think we can do is to do a conservative analysis in Rust itself that determines whether:
Either of this should be enough to avoid undefined behavior. |
|
triage: P-medium We'd like to see what LLVM will do before we invest a lot of effort on our side, and this seems relatively unlikely to cause problems in practice (though I have personally hit this while developing the compiler as well). There are no backwards incomatibility issues to be concerned about. |
|
Quoting from the LLVM mailing list discussion:
|
|
@dotdash The excerpt you are quoting comes from the C++ specification; it is basically the answer to "how it [having side effects] is defined in C" (also confirmed by the standard committee: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1528.htm ). Regarding what is the expected behaviour of the LLVM IR there is some confusion. https://llvm.org/bugs/show_bug.cgi?id=24078 shows that there seems to be no accurate & explicit specification of the semantics of infinite loops in LLVM IR. It aligns with the semantics of C++, most likely for historical reasons and for convenience (I only managed to track down https://groups.google.com/forum/#!topic/llvm-dev/j2vlIECKkdE which apparently refers to a time when infinite loops were not optimised away, some time before the C/C++ specs were updated to allow it). From the thread it is clear that there is the desire to optimise C++ code as effectively as possible (i.e. also taking into account the opportunity to remove infinite loops), but in the same thread several developers (including some that actively contribute to LLVM) have shown interest in the ability to preserve infinite loops, as they are needed for other languages. |
|
@ranma42 I'm aware of that, I just quoted that for reference, because one possibility to work-around this would be to detect such loops in rust and add one of the above to it to stop LLVM from performing this optimization. |
|
Yes, following @ranma42's example, this way shows how it readily defeats array bounds checks. playground link |
|
The policy is that wrong-code issues that are also soundness issues (i.e. most of them) should be tagged |
|
So just to recap prior discussion, there are really two choices here that I can see:
The latter is kind of bad because it can inhibit optimization, so we'd want to do it somewhat sparingly -- basically wherever we can't prove termination ourselves. You could also imaging tying it a bit more to how LLVM optimizes -- i.e., introducing only if we can detect a scenario that LLVM might consider to be an infinite loop/recursion -- but that would (a) require tracking LLVM and (b) require deeper knowledge than I, at least, possess. |
|
side-note: |
|
Also, note that this is invalid for C. LLVM making this argument means that there is a bug in clang. void foo() { while (1) { } }
void create_null() {
foo();
int i = 0;
while (i < 100) { i += 1; }
}
__attribute__((noreturn))
void use_null() {
__builtin_unreachable();
}
int main() {
create_null();
use_null();
}This crashes with optimizations; this is invalid behavior under the C11 standard:
Note the "whose controlling expression is not a constant expression" - |
|
Did you find a bug report for that in LLVM's bugzilla or filled one? It seems that in C++ infinite loops that can never terminate are undefined behavior, but in C they are defined behavior (either they can be safely removed in some cases, or they cannot in others). |
|
@gnzlbg I'm filing a bug now. |
The following snippet crashes when compiled in release mode on current stable, beta and nightly:
https://play.rust-lang.org/?gist=1f99432e4f2dccdf7d7e&version=stable
This is based on the following example of LLVM removing a loop that I was made aware of: https://github.com/simnalamburt/snippets/blob/master/rust/src/bin/infinite.rs.
What seems to happen is that since C allows LLVM to remove endless loops that have no side-effect, we end up executing a
matchthat has to arms.