Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert simple byte handlers in lexer to branchless code #3292

Open
overlookmotel opened this issue May 15, 2024 · 0 comments
Open

Convert simple byte handlers in lexer to branchless code #3292

overlookmotel opened this issue May 15, 2024 · 0 comments
Labels
A-parser Area - Parser C-performance Category - Solution not expected to change functional behavior, only performance E-Help Wanted Experience level - For the experienced collaborators

Comments

@overlookmotel
Copy link
Collaborator

overlookmotel commented May 15, 2024

Why the lexer is slower than it needs to be

Many of the simpler byte handlers in the lexer perform multiple branches.

e.g. The handler for = which needs to figure out "is the whole token =, ==, ===, or =>?"

// =
ascii_byte_handler!(EQL(lexer) {
lexer.consume_char();
if lexer.next_eq('=') {
if lexer.next_eq('=') {
Kind::Eq3
} else {
Kind::Eq2
}
} else if lexer.next_eq('>') {
Kind::Arrow
} else {
Kind::Eq
}
});

Typical JS code will contain a random mix of assignments x = 1, equality checks x === 1, and arrow functions =>.

Because of the random distribution, the branch predictor won't be able to predict what the token is going to be, and so which way these branches are going to go. It will likely get it wrong at least 50% of the time. Branch mis-predication is fairly expensive (I think I read about 20 CPU cycles each time), and these functions are on a hot path because =, ===, and => are all common in JS code.

Make it branchless!

We could convert these matchers to branchless code with something like this:

// Set the discriminants for `Kind` to always follow the pattern
// that `discriminant % 4` is number of bytes this token takes
const EQ_BASE: u8 = 128; // or whatever multiple of 4

#[derive(PartialEq, Clone, Copy)]
#[repr(u8)]
pub enum Kind {
    // `=`
    Eq = EQ_BASE + 1,
    // `==`
    Eq2 = EQ_BASE + 2,
    // `===`
    Eq3 = EQ_BASE + 3,
    // `=>`
    Arrow = EQ_BASE + 4 + 2,

    // ...all the other token kinds
}

ascii_byte_handler!(EQL(lexer) { 
    const fn num_bytes(kind: Kind) -> usize { kind as usize % 4 }

    // Check the invariant about discrimants mentioned above at compile time
    const_assert!(num_bytes(Kind::Eq) == 1);
    const_assert!(num_bytes(Kind::Eq2) == 2);
    const_assert!(num_bytes(Kind::Eq3) == 3);
    const_assert!(num_bytes(Kind::Arrow) == 2);

    // NB: For simplicity, not handling case where close to EOF
    // and there are not 2 more bytes remaining in source to read.
    // Would need to handle that case separately, in a `#[cold]` branch.

    let next2: [u8; 2] = unsafe { lexer.source.position().add(1).read2() };
    let kind = if next2 == [b'=', b'='] {
        Kind::Eq3
    } else {
        match next2[0] {
            b'=' => Kind::Eq2,
            b'>' => Kind::Arrow,
            _ => Kind::Eq,
        }
    };

    unsafe { lexer.source.advance(num_bytes(kind)); }

    kind
});

The trick is to make Kind play two roles. It's the token kind, but it also encodes how many bytes the token is. This is what allows to compiler to use cmovne instructions rather than branches.

Godbolt shows this compiles down to branchless code: https://godbolt.org/z/Me5Pvj5cP

Note on benchmarking

We need to alter the benchmarking setup before we do this. CodSpeed have advised that the way their benchmarking works, it doesn't take into account branch mis-prediction.

So, even if this change makes a solid difference in real world (which I think it could), it won't show up on our CodSpeed benchmarks.

We need some way to check that this is improving performance in the way that we think it is.

@overlookmotel overlookmotel added A-parser Area - Parser C-performance Category - Solution not expected to change functional behavior, only performance labels May 15, 2024
@Boshen Boshen added the E-Help Wanted Experience level - For the experienced collaborators label May 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-parser Area - Parser C-performance Category - Solution not expected to change functional behavior, only performance E-Help Wanted Experience level - For the experienced collaborators
Projects
None yet
Development

No branches or pull requests

2 participants