Fast path for parsing decimal literals #3288

overlookmotel · 2024-05-15T08:11:46Z

#3283 made an optimization to the parser to avoid checking for _s in numeric literals twice, and got a nice 1% speed-up on parser benchmarks.

It occurs to me that we could repeat the trick for the common case of simple decimal literals (e.g. 0, 1, 123). str::parse::<f64>() is quite a complicated function which checks for all kinds of things (exponent, decimal point, valid input etc). All of this is repeated work, since we've already determined in the lexer what the input is.

Perhaps we could have a flag which lexer sets if the input is a simple case, and then use a hand-rolled parse_simple_decimal function to convert to f64, like the ones we have for binary and octal?

oxc/crates/oxc_parser/src/lexer/number.rs

Lines 76 to 88 in f38d138

    
           fn parse_octal(s: &str) -> f64 { 
        
               debug_assert!(!s.is_empty()); 
        
               let mut result = 0_f64; 
        
               for c in s.as_bytes().iter().filter(|s| s != &&b'_') { 
        
                   #[allow(clippy::cast_lossless)] 
        
                   let value = (c - b'0') as f64; 
        
                   result = result.mul_add(8.0, value); 
        
               } 
        
               result 
        
           }

(we should base implementation on std's str::parse::<f64> as it's probably very optimized already, but just cut out all the extraneous checks)

Given that #3283 got a 1% speed-up, maybe this can get the same again, or maybe more?

@DonIsaac Would you be interested in investigating?

The text was updated successfully, but these errors were encountered:

overlookmotel · 2024-05-15T08:30:59Z

As I mentioned in #3289, it's mysterious why #3283 got a 5% speed bump on lexer benchmarks.

Before we test out what I'm suggesting in this issue, we should investigate that further. If it turns out that converting the padding bytes in Token from u8 to bool is a speed-gain in itself, we should do that first, so then we can measure the impact of the change described here in isolation (as it will also involve using one of those padding bytes as a bool).

DonIsaac · 2024-05-15T18:12:28Z

Just did something similar for non-decimals in #3296. Decimals are up next

DonIsaac · 2024-05-16T15:01:49Z

@overlookmotel I discovered that Kind::Decimals are never floating point numbers. However, when I tried parsing them as an i64 then casting the result to an f64, I found no performance improvements. However, my local benchmarking is not always accurate, so I'll make a small PR so we can see bench results from CI

overlookmotel added the C-performance Category - Solution not expected to change functional behavior, only performance label May 15, 2024

overlookmotel assigned Boshen May 15, 2024

Boshen removed their assignment May 15, 2024

overlookmotel added the A-parser Area - Parser label May 15, 2024

DonIsaac self-assigned this May 15, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fast path for parsing decimal literals #3288

Fast path for parsing decimal literals #3288

overlookmotel commented May 15, 2024 •

edited

overlookmotel commented May 15, 2024 •

edited

DonIsaac commented May 15, 2024

DonIsaac commented May 16, 2024

Fast path for parsing decimal literals #3288

Fast path for parsing decimal literals #3288

Comments

overlookmotel commented May 15, 2024 • edited

overlookmotel commented May 15, 2024 • edited

DonIsaac commented May 15, 2024

DonIsaac commented May 16, 2024

overlookmotel commented May 15, 2024 •

edited

overlookmotel commented May 15, 2024 •

edited