Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regexp.exec is expensive after the es6 branch #201

Closed
mstoykov opened this issue Sep 15, 2020 · 3 comments
Closed

Regexp.exec is expensive after the es6 branch #201

mstoykov opened this issue Sep 15, 2020 · 3 comments

Comments

@mstoykov
Copy link
Contributor

I found one more ... performance problem with the new regexp that is being hit by k6 users ...

Apperantly the way matchAll is being polyfilled is to call exec repeatedly which is fine with small inputs or small calls to it, but with bigger inputs (especially combined with multiple matches) it means that goja will call runes() multitude of times generating the []rune representation of the string a lot :(.

This is even bigger problem for k6 as the way it works is to run multiple (hundreds, sometimes thousands) goja VMs each running the same script, which likely will hit the regex at the same time. One particular case(that provoked this investigation of mine) is that this goes from 200-300 VMs can do this in parallel over half a second to the same can't actually finish doing it over 1-2 minutes and instead eat all the ram :(

You can see result from benchmarks here compared to before es6.

I don't really have any great ideas :D, but here they are in .. some order:

  1. As far as I know the []rune thing is needed in order to run utf16 regexp expressions ... but I would expect most wouldn't actually need that ... or maybe I am just completely misunderstanding the reason. If I am not though, checking whether it is required is likely going to be a lot less expensive ?
  2. If goja is using the runes() method a lot, maybe it will be a good idea to cache it's output? either in the string representation or somewhere else?
  3. exec can possibly do the above - caching it's input and checking it gets the same one.
  4. implementing matchAll will remove at least one of the cases where this is a possible problem, but at least for k6, this was actually provoked by a client code that was just using exec and implementing something like matchAll, so that won't help in that case, but arguably is the most straight forward fix

p.s. I have generally been in the camp of "people are overusing regexes" and this past few issues have not changed that ... but apperantly people just want to use regexes :(

@dop251
Copy link
Owner

dop251 commented Sep 23, 2020

Could you provide some test cases? The heavy usage of runes() seems a bit odd as it would only be used on the first match, all subsequent matches would be done with a regexp2Wrapper which does not use this method.

I've implemented caching of the runes and posMap only to find out that it makes very little difference in the benchmarks. In order for it to be noticeable it has to be a really long string and a really long match string.

@mstoykov
Copy link
Contributor Author

Is the test case provided, not good enough, I can come up with something bigger if needed ;) .

For me at least profilling that benchmarks shows runes() is using the majority of the CPU ... which I expect it won't be the case if it was called only once, and the mem profiling shows allocation object counts of 100k for the same benchmark

Showing top 25 nodes out of 164
      flat  flat%   sum%        cum   cum%
     0.59s 16.71% 16.71%         1s 28.33%  github.com/dop251/goja.asciiString.runes
     0.38s 10.76% 27.48%      0.68s 19.26%  runtime.scanobject
     0.32s  9.07% 36.54%      0.32s  9.07%  runtime.memclrNoHeapPointers
     0.14s  3.97% 40.51%      0.14s  3.97%  runtime.nanotime (inline)
     0.13s  3.68% 44.19%      0.13s  3.68%  runtime.futex
     0.12s  3.40% 47.59%      0.18s  5.10%  runtime.findObject
     0.08s  2.27% 49.86%      0.86s 24.36%  runtime.mallocgc

Looking at the code, every time exec is called it will call execRegexp->findSubmatchIndex->regexp2wrapper.FinsSubmatchIndex->utf16Runes->runes, in the case of an ascii string.

@dop251
Copy link
Owner

dop251 commented Sep 26, 2020

For some reason I completely missed the test case... Anyway, I've pushed a few commits that should bring it on par with es5 (and in some cases improve it).

My benchmarks show this (a695b0c vs master):

name                            old time/op    new time/op    delta
RegexpMatchCache-4                3.50ms ± 2%    3.57ms ± 7%     ~     (p=0.905 n=5+4)
RegexpSplitWithBackRef-4          7.97µs ± 2%    8.76µs ± 3%   +9.94%  (p=0.008 n=5+5)
RegexpSingleExec/Re-ASCII-4       4.18µs ± 2%    1.85µs ± 1%  -55.68%  (p=0.008 n=5+5)
RegexpSingleExec/Re2-ASCII-4      2.94µs ± 1%    2.67µs ± 4%   -9.21%  (p=0.008 n=5+5)
RegexpSingleExec/Re-Unicode-4     4.38µs ± 1%    3.79µs ± 1%  -13.50%  (p=0.008 n=5+5)
RegexpSingleExec/Re2-Unicode-4    3.19µs ± 2%    2.87µs ± 4%  -10.18%  (p=0.008 n=5+5)

name                            old alloc/op   new alloc/op   delta
RegexpMatchCache-4                1.14MB ± 0%    1.41MB ± 0%  +23.43%  (p=0.016 n=5+4)
RegexpSingleExec/Re-ASCII-4         960B ± 0%      773B ± 0%  -19.46%  (p=0.008 n=5+5)
RegexpSingleExec/Re2-ASCII-4      1.46kB ± 0%    1.32kB ± 0%   -9.84%  (p=0.008 n=5+5)
RegexpSingleExec/Re-Unicode-4       984B ± 0%      824B ± 0%  -16.29%  (p=0.000 n=5+4)
RegexpSingleExec/Re2-Unicode-4    1.49kB ± 0%    1.34kB ± 0%   -9.68%  (p=0.008 n=5+5)

name                            old allocs/op  new allocs/op  delta
RegexpMatchCache-4                 22.4k ± 0%     22.4k ± 0%   -0.02%  (p=0.029 n=4+4)
RegexpSingleExec/Re-ASCII-4         18.0 ± 0%      12.0 ± 0%  -33.33%  (p=0.008 n=5+5)
RegexpSingleExec/Re2-ASCII-4        23.0 ± 0%      19.0 ± 0%  -17.39%  (p=0.008 n=5+5)
RegexpSingleExec/Re-Unicode-4       20.0 ± 0%      15.0 ± 0%  -25.00%  (p=0.008 n=5+5)
RegexpSingleExec/Re2-Unicode-4      25.0 ± 0%      21.0 ± 0%  -16.00%  (p=0.008 n=5+5)

@dop251 dop251 closed this as completed Dec 8, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants