Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

url: reduce deplicated codes in autoEscapeStr #18613

Closed
wants to merge 5 commits into from

Conversation

starkwang
Copy link
Contributor

@starkwang starkwang commented Feb 7, 2018

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • commit message follows commit guidelines
Affected core subsystem(s)

url

@nodejs-github-bot nodejs-github-bot added the url Issues and PRs related to the legacy built-in url module. label Feb 7, 2018
@starkwang
Copy link
Contributor Author

@BridgeAR
Copy link
Member

BridgeAR commented Feb 7, 2018

@starkwang I guess it might have a minor performance penalty but without having a significant impact. Would you be so kind and provide some benchmarks for it nevertheless? :-)

@starkwang
Copy link
Contributor Author

@BridgeAR Pushed commit to fix performance regression. Here is the benchmark:

                                             confidence improvement accuracy (*)    (**)   (***)
 url/url-parse-.js n=1000000 type="escaped"          *     -4.06 %      ±3.29% ±4.38% ±5.70%
 url/url-parse-.js n=1000000 type="normal"           *      2.04 %      ±1.64% ±2.18% ±2.83%

@starkwang
Copy link
Contributor Author

lib/url.js Outdated
34: '%22', 39: '%27', 60: '%3C', 62: '%3E',
92: '%5C', 94: '%5E', 96: '%60', 123: '%7B',
124: '%7C', 125: '%7D'
};
function autoEscapeStr(rest) {
var escaped = '';
var lastEscapedPos = 0;
for (var i = 0; i < rest.length; ++i) {
// Manual switching is faster than using a Map/Object.
Copy link
Member

@joyeecheung joyeecheung Feb 7, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the whole point of having these switch statements to avoid the overhead of accessing a Map/object? (although that looks quite Crankshaftscript-ish so I doubt if the comment here still holds)...if we are going to access an object anyway, then the switch statements can just be replaced with something like const code = escapedCharacterCodes[rest.charCodeAt(i)]; if (code)..., and this comment can be removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's exactly what I do in the first commit, but it causes ~10% performance regression in url.parse.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bmeurer Would you be interested in taking a look at this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intuitively I'd suggest to go with an array instead of an object literal. Did you try with an array?

@joyeecheung
Copy link
Member

joyeecheung commented Feb 7, 2018

Benchmark CI: https://ci.nodejs.org/job/benchmark-node-micro-benchmarks/117/
Ah, right, the benchmark CI cannot compare new benchmarks..

@starkwang
Copy link
Contributor Author

@bmeurer @joyeecheung I tried to use array and found it was faster. Here is the benchmark with array:

                                                confidence improvement accuracy (*)    (**)   (***)
 url/url-parse.js n=10000000 type="escaped"          *     -2.80 %      ±2.67% ±3.55% ±4.63%
 url/url-parse.js n=10000000 type="normal"         ***      1.94 %      ±0.77% ±1.03% ±1.36%

lib/url.js Outdated
const escapedCodesArr = new Array();
for (var key in escapedCodes) {
escapedCodesArr[key] = escapedCodes[key];
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is a holey array as fast as a dense array? I expect it would be better to add empty entries instead of having holes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that dense array is still faster than holey array: https://jsperf.com/packed-vs-holey-arrays

But the comparison above is between the array case and the original switch case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you be so kind and switch to the dense array in that case? :-)

I guess we do not need a new benchmark in that case even though it would of course still be nice to have the results with those.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I don't fully understand. The code of chars in rest is not dense, so the array used to map them is not dense as well. Changing it to a dense array may cost a lot.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I meant is to use it like e.g. util.js.

@starkwang
Copy link
Contributor Author

@starkwang
Copy link
Contributor Author

Landed in 3cef3e6

@starkwang starkwang closed this Feb 23, 2018
starkwang added a commit that referenced this pull request Feb 23, 2018
addaleax pushed a commit to addaleax/node that referenced this pull request Feb 26, 2018
addaleax pushed a commit that referenced this pull request Feb 26, 2018
@addaleax addaleax mentioned this pull request Feb 27, 2018
MayaLekova pushed a commit to MayaLekova/node that referenced this pull request May 8, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
url Issues and PRs related to the legacy built-in url module.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants