Convert simple byte handlers in lexer to branchless code #3292
Labels
A-parser
Area - Parser
C-performance
Category - Solution not expected to change functional behavior, only performance
E-Help Wanted
Experience level - For the experienced collaborators
Why the lexer is slower than it needs to be
Many of the simpler byte handlers in the lexer perform multiple branches.
e.g. The handler for
=
which needs to figure out "is the whole token=
,==
,===
, or=>
?"oxc/crates/oxc_parser/src/lexer/byte_handlers.rs
Lines 389 to 403 in b9d69ad
Typical JS code will contain a random mix of assignments
x = 1
, equality checksx === 1
, and arrow functions=>
.Because of the random distribution, the branch predictor won't be able to predict what the token is going to be, and so which way these branches are going to go. It will likely get it wrong at least 50% of the time. Branch mis-predication is fairly expensive (I think I read about 20 CPU cycles each time), and these functions are on a hot path because
=
,===
, and=>
are all common in JS code.Make it branchless!
We could convert these matchers to branchless code with something like this:
The trick is to make
Kind
play two roles. It's the token kind, but it also encodes how many bytes the token is. This is what allows to compiler to usecmovne
instructions rather than branches.Godbolt shows this compiles down to branchless code: https://godbolt.org/z/Me5Pvj5cP
Note on benchmarking
We need to alter the benchmarking setup before we do this. CodSpeed have advised that the way their benchmarking works, it doesn't take into account branch mis-prediction.
So, even if this change makes a solid difference in real world (which I think it could), it won't show up on our CodSpeed benchmarks.
We need some way to check that this is improving performance in the way that we think it is.
The text was updated successfully, but these errors were encountered: