Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revisit compiler options for tests #290

Open
nuald opened this issue Dec 14, 2020 · 8 comments
Open

Revisit compiler options for tests #290

nuald opened this issue Dec 14, 2020 · 8 comments

Comments

@nuald
Copy link
Collaborator

nuald commented Dec 14, 2020

Some tests use compilation flags which are not used in the production (like -d:danger or -mbranches-within-32B-boundaries or -fbounds-check=off) thus not giving the realistic results. README should be up-to-dated with the proper instructions to not use non-production flags, and the flags themselves should be up-to-dated too.

@beached
Copy link
Contributor

beached commented Dec 14, 2020

The branches within 32B boundaries is the fix for an intel cpu defect that causes inconsistent performance dur to simple changes. #266

@nuald
Copy link
Collaborator Author

nuald commented Dec 14, 2020

@beached That's a tricky one. Officially, it's just a performance regression on Intel processors, not a "defect", and could be eventually fixed by the microcode update. However, I have the latest update for my CPU (released a month ago by Intel), and that flag still shows the performance changes. Moreover, various vendors apply that flag (or rather the required workaround) in their code too, for example - OpenJDK: openjdk/jdk@ccdde49

I hoped that GCC/LLVM will include that flag into the release meta-flag (-O3), but looks like it's not happening, and I'm not sure why. I'd inclined to remove that flag and assume that eventually it'll become default (or Intel provide proper update), but as the same time it will make the tests a little bit unfair as other vendors like Oracle already applied that workaround. @kostya Do you have a second opinion regarding that?

@beached
Copy link
Contributor

beached commented Dec 14, 2020

my understanding was the flag is because the microcode fix causes perf issues when conditionals are not aligned. the benchmarking issue is the difference can be large when something changes out of the benchmarked code because a new statement throws the expressed machine code out of a alignment and the cpu flushes caches.

me too, i thought it would belong in the arch flags as it affects only a class of cpus

@kostya
Copy link
Owner

kostya commented Dec 15, 2020

i think we not removing 32B boundaries as an exception

@dumblob
Copy link

dumblob commented Oct 13, 2021

Thanks @ricvelozo for linking this issue.

How about providing both options in the tests (i.e. one with bounds checked and one without) as proposed in #378 (comment) ?

Bounds checking implementations also differ among languages, so this is definitely something which we want to have included in the benchmark results.

Btw. I woudn't do any exceptions for microcode "mistakes" and would not allow things like -mbranches-within-32B-boundaries.

@beached
Copy link
Contributor

beached commented Oct 13, 2021

Btw. I woudn't do any exceptions for microcode "mistakes" and would not allow things like -mbranches-within-32B-boundaries.

The issue this fixes is that a large majority of intel CPU's cannot do reliable benchmarking and this makes it consistent or did. Prior it would penalize libraries if they happen to generate branch code that sat off a 32byte boundary. So adding/removing code from a library would have random performance impacts.

@dumblob
Copy link

dumblob commented Oct 13, 2021

@beached yes, I understand the consequences.

My point is, that this is rather a precondition for running this "kostya benchmarks" suit (i.e. running it on a CPU which is known to not exhibit this random performance spikes/downs) rather than a "feature" of the individual tests in the benchmark.

@ricvelozo ricvelozo mentioned this issue Mar 10, 2023
@ricvelozo
Copy link
Contributor

In Swift we use the -Ounchecked flag, which disables integer overflow checks and preconditions (release mode asserts). It means some methods can do UB.

In standard release mode -O, just the debug asserts are disabled, and the preconditions abort the program silently on failure. I don't know what optimization level is used in real world apps in Apple Store.

In C++ we use -O3 flag, which in some cases produces wrong assembly or worse performance. Asserts are disabled in release builds, but static_assert aren't (but are compile-time).

In Rust, the standard release mode disables integer overflow checks, but it can be configured separately. The normal asserts cannot be disabled.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants