One of the questions that I asked some time ago had undefined behavior, so compiler optimization was actually causing the program to break.
But if there is no undefined behavior in your code, is there ever a reason not to use compiler optimization? I understand that, for debugging purposes, one might not want optimized code (please correct me if I am wrong). Other than that, on production code, why not always use compiler optimization?
Also, is there ever a reason to use, say, -O instead of -O2 or -O3?
9 Answers
If there is no undefined behavior, but there is definite broken behavior (either deterministic normal bugs, or indeterminate like race-conditions), it pays to turn off optimization so you can step through your code with a debugger.
Typically, when I reach this kind of state, I like to do a combination of:
- debug build (no optimizations) and step through the code
- sprinkled diagnostic statements to stderr so I can easily trace the run path
If the bug is more devious, I pull out valgrind and drd, and add unit tests as needed, both to isolate the problem and to ensure that, when the problem is found, the solution works as expected.
In some extremely rare cases, the debug code works but the release code fails. When this happens, the problem is almost always in my code; aggressive optimization in release builds can reveal bugs caused by misunderstood lifetimes of temporaries, and so on. But even in this kind of situation, having a debug build helps to isolate the issues.
In short, there are some very good reasons why professional developers build and test both debug (non-optimized) and release (optimized) binaries. IMHO, having both debug and release builds pass unit-tests at all times will save you a lot of debugging time.
Compiler optimisations have two disadvantages:
- Optimisations will almost always rearrange and/or remove code. This will reduce the effectiveness of debuggers, because there will no longer be a 1 to 1 correspondence between your source code and the generated code. Parts of the stack may be missing, and stepping through instructions may end up skipping over parts of the code in counterintuitive ways.
- Optimisation is usually expensive to perform, so your code will take longer to compile with optimisations turned on than otherwise. It is difficult to do anything productive while your code is compiling, so obviously shorter compile times are a good thing.
Some of the optimisations performed by -O3 can result in larger executables. This might not be desirable in some production code.
Another reason to not use optimisations is that the compiler that you are using may contain bugs that only exist when it is performing optimisation. Compiling without optimisation can avoid those bugs. If your compiler does contain bugs, a better option might be to report/fix those bugs, to change to a better compiler, or to write code that avoids those bugs completely.
If you want to be able to perform debugging on the released production code, then it might also be a good idea to not optimise the code.
3 Reasons
- It confuses the debugger, sometimes
- It's incompatible with some code patterns
- Not worth it: slow or buggy, or takes too much memory, or produces code that's too big.
In case 2, imagine some OS code that deliberately changes pointer types. The optimizer can assume that objects of the wrong type could not be referenced, keep memory values cached in registers across writes through the other pointer type, and get the 'wrong'¹ answer.
Case 3 is an interesting concern. Sometimes optimizers make code smaller but sometimes they make it bigger. Most programs are not the least bit CPU-bound and even for the ones that are, only 10% or less of the code is actually computationally-intensive. If there is any downside at all to the optimizer then it is only a win for less than 10% of a program.
If the generated code is larger, then it will be less cache-friendly. This might be worth it for a matrix algebra library with O(n³) algorithms in tiny little loops. But for something with more typical time complexity, overflowing the cache might actually make the program slower. Optimizers can typically be tuned for all this stuff, but if the program is a web application, say, it would certainly be more developer-friendly if the compiler would just do the all-purpose things and let the developer leave the fancy-tricks Pandora's box unopened.
1. Such programs are usually not standard-conforming so the optimizer is technically 'correct', but still not doing what the developer intended.
The reason is that you develop one application (the debug build) while your customers run a completely different application (the release build). If testing resources are low and/or the compiler used is not very popular, I would disable optimization for release builds.
MS publishes numerous hotfixes for optimization bugs in their MSVC x86 compiler. Fortunately, I've never encountered one in real life. But that has not been the case with other compilers: the SH4 compiler in MS Embedded Visual C++ was very buggy.
Just happened to me. The code generated by swig for interfacing Java is correct but won't work with -O2 on gcc.
Two big reasons that I have seen arise from floating point math and overly aggressive inlining. The former is caused by the fact that floating point math is very loosely specified by the C++ standard. Many processors perform calculations using 80 bits of precision, for instance, only dropping down to 64 bits when the value is put back into main memory. If one version of a routine flushes that value to memory frequently, while another only stores the value once at the end, the results of the calculations can be slightly different. Just tweaking the optimizations for that routine may well be a better move than refactoring the code to be more robust to the differences.
Inlining can be problematic because, by its very nature, it generally results in larger object files. Perhaps this increase in code size is unacceptable for practical reasons: it needs to fit on a device with limited memory, for instance. Or perhaps the increase in code size makes the code slower. If a routine becomes big enough that it no longer fits in cache, the resulting cache misses can quickly outweigh the benefits inlining provided in the first place.
I frequently hear of people who, when working in a multi-threaded environment, switch from a debug build to an optimized one and immediately encounter hordes of new bugs due to newly uncovered race conditions and the like. The optimizer merely revealed the underlying buggy code here, though, so turning it off in response is probably ill advised.
Here is an example of why using an optimization flag can sometimes be dangerous, and why our tests should cover most of the code to notice such an error.
Using clang (because gcc performs some optimizations even without an optimization flag, so its output is already affected):
File: a.cpp
Without an optimization flag:
> clang --output withoutOptimization a.cpp; ./withoutOptimization
>Goodbye!
With -O1:
> clang --output withO1 -O1 a.cpp; ./withO1
>Hello, world!
One example is short-circuit boolean evaluation. Something like:
A 'smart' compiler might realize that someFunc will always return false for some reason, making the entire statement evaluate to false, and decide not to call otherFunc to save CPU time. But if otherFunc contains some code that directly affects program execution (maybe it resets a global flag or something), it now won't perform that step, and your program enters an unknown state.