build: xz tarball extreme compression #10626


PeterDaveHello
Member

@PeterDaveHello PeterDaveHello commented Jan 5, 2017

Taking node v7.4.0 as an example, the tarball size improvements are listed below:

node-v7.4.0-darwin-x64.tar.xz     9176904 ->  9147884 (99.68%)
node-v7.4.0-headers.tar.xz         351224 ->   349612 (99.54%)
node-v7.4.0-linux-arm64.tar.xz    9271000 ->  9254748 (99.82%)
node-v7.4.0-linux-armv6l.tar.xz   9243504 ->  9227428 (99.82%)
node-v7.4.0-linux-armv7l.tar.xz   9246228 ->  9228732 (99.81%)
node-v7.4.0-linux-ppc64.tar.xz    9448476 ->  9411128 (99.60%)
node-v7.4.0-linux-ppc64le.tar.xz  9553876 ->  9521424 (99.66%)
node-v7.4.0-linux-s390x.tar.xz    9923212 ->  9901772 (99.78%)
node-v7.4.0-linux-x64.tar.xz     10318700 -> 10304396 (99.86%)
node-v7.4.0-linux-x86.tar.xz      9907848 ->  9886448 (99.78%)
node-v7.4.0-sunos-x86.tar.xz      9742620 ->  9732160 (99.89%)
node-v7.4.0.tar.xz               16611356 -> 16459192 (99.08%)

So we can expect an improvement on all the xz
tarball releases!
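
The percentage column above reads as the new size divided by the old size. As a minimal sketch (the helper name `ratio_percent` is hypothetical, and only three rows of the table are copied in), the column can be recomputed like this:

```python
# Sizes in bytes, copied from three rows of the v7.4.0 table: (before, after).
sizes = {
    "node-v7.4.0-headers.tar.xz": (351224, 349612),
    "node-v7.4.0-linux-x64.tar.xz": (10318700, 10304396),
    "node-v7.4.0.tar.xz": (16611356, 16459192),
}


def ratio_percent(before: int, after: int) -> float:
    """New size as a percentage of the old size (below 100 means smaller)."""
    return after / before * 100


for name, (before, after) in sizes.items():
    print(f"{name:30} {before:>9} -> {after:>9} ({ratio_percent(before, after):.2f}%)")
```

A value below 100% means the second size is the smaller one; a value above 100% would mean the tarball grew.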

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • commit message follows commit guidelines
Affected core subsystem(s)

build

@nodejs-github-bot nodejs-github-bot added build Issues and PRs related to build files or the CI. lts-watch-v6.x labels Jan 5, 2017
@mscdex
Contributor

mscdex commented Jan 5, 2017

According to the xz manpage, using -e can increase the compression time "dramatically (it can easily double)."

Is that really worth it for <=1% difference?

@PeterDaveHello
Member Author

PeterDaveHello commented Jan 5, 2017

@mscdex I think it depends on the download counts on the server. Personally I don't feel the time increases that much. Take node-v7.4.0-linux-x64.tar as an example, on my E3-1220 V2 CPU:

$ time xz -9 node-v7.4.0-linux-x64.tar 

real    0m20.632s
user    0m20.432s
sys     0m0.192s

$ time xz -9e node-v7.4.0-linux-x64.tar                                   

real    0m27.529s
user    0m27.378s
sys     0m0.124s

The time increases by about 33%, but that is still only a few (7) seconds 😄
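
For anyone who wants to repeat this measurement without the `xz` command line, here is a rough sketch using Python's standard `lzma` module, where `preset=9 | lzma.PRESET_EXTREME` is assumed to correspond to `xz -9e`. The function names are hypothetical, and absolute timings and sizes may differ somewhat from the CLI.

```python
import lzma
import time
from typing import Dict, Tuple


def compress_timed(data: bytes, preset: int) -> Tuple[float, int]:
    """Return (elapsed seconds, compressed size) for one .xz compression run."""
    start = time.perf_counter()
    compressed = lzma.compress(data, format=lzma.FORMAT_XZ, preset=preset)
    return time.perf_counter() - start, len(compressed)


def extreme_overhead(data: bytes, level: int = 9) -> Dict[str, float]:
    """Compare a plain preset with the same preset plus the 'extreme' flag."""
    plain_seconds, plain_size = compress_timed(data, level)
    extreme_seconds, extreme_size = compress_timed(data, level | lzma.PRESET_EXTREME)
    return {
        "plain_seconds": plain_seconds,
        "extreme_seconds": extreme_seconds,
        "plain_size": plain_size,
        "extreme_size": extreme_size,
    }
```

Reading the uncompressed tarball with `open("node-v7.4.0-linux-x64.tar", "rb").read()` and passing the bytes to `extreme_overhead` would give numbers comparable to the `time xz` runs above. Note that preset 9 needs several hundred MiB of memory while compressing.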

@gibfahn
Member

gibfahn commented Jan 5, 2017

So the average size decrease is 0.31%? I'm not sure how useful that'd be.
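
The 0.31% figure matches an unweighted mean over the percentage column of the v7.4.0 table. A small sketch of that arithmetic, with the percentages copied as listed:

```python
from statistics import mean

# Percentage column of the v7.4.0 table, in the order listed.
ratios_percent = [
    99.68, 99.54, 99.82, 99.82, 99.81, 99.60,
    99.66, 99.78, 99.86, 99.78, 99.89, 99.08,
]

# Unweighted average: every tarball counts equally, regardless of its size.
average_decrease = 100 - mean(ratios_percent)
print(f"average decrease: {average_decrease:.2f}%")  # prints: average decrease: 0.31%
```

This treats every tarball equally; weighting by file size or by download count would give a somewhat different number.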

@PeterDaveHello
Member Author

PeterDaveHello commented Jan 5, 2017

It's not that significant an improvement, but those 7 seconds (on my computer; it may be even shorter on nodejs's build server) could save more than 7 seconds in total around the world if there are enough downloads, and I guess there will be 😄

Since the tarball is static, once it's released it won't be touched again in most cases, and it could be downloaded millions of times (depending on the time window). Even if it only saves ~30 KB, 30 KB x 1,000,000 = ~30 GB, so we save about 30 GB of bandwidth per million downloads. At a large scale, the ~7 seconds may be worth it?
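
As a quick check of the back-of-the-envelope estimate: with decimal units, 30 KB saved per download over one million downloads comes to about 30 GB. A sketch with a hypothetical helper name:

```python
def bandwidth_saved_gb(bytes_saved_per_download: int, downloads: int) -> float:
    """Total bandwidth saved in decimal gigabytes (1 GB = 1e9 bytes)."""
    return bytes_saved_per_download * downloads / 1e9


saved = bandwidth_saved_gb(30_000, 1_000_000)
print(f"{saved:.0f} GB per million downloads")  # prints: 30 GB per million downloads
```

Plugging in the 14,304 bytes saved on node-v7.4.0-linux-x64.tar.xz instead of the rounded 30 KB gives roughly 14 GB per million downloads.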

@gibfahn
Member

gibfahn commented Jan 5, 2017

@PeterDaveHello So it's 7 seconds more on your computer to compress? How much longer does it take to extract?

@PeterDaveHello
Member Author

@gibfahn I didn't see a significant difference in decompression time: 1.287 secs vs 1.281 secs.

@jasnell
Member

jasnell commented Mar 24, 2017

Updates on this one?

@jasnell jasnell added the stalled Issues and PRs that are stalled. label Mar 24, 2017
@PeterDaveHello
Member Author

Same question :)

@bnoordhuis
Member

Decompression time should be the same if I understand xz's 'extreme' algorithm correctly, but the resulting file could be either bigger or smaller than without -e.

It would be interesting to see if our tarballs are consistently smaller across releases or if it's hit and miss. Any volunteers?
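
One way such a survey could be scripted is sketched below. It again assumes Python's `lzma` presets stand in for `xz -9` and `xz -9e`; `compare_presets` and `count_wins` are hypothetical names, and the input is a mapping from tarball name to its uncompressed bytes.

```python
import lzma
from typing import Dict, List, Tuple


def compare_presets(blobs: Dict[str, bytes], level: int = 9) -> List[Tuple[str, int, int]]:
    """Return (name, plain_size, extreme_size) for each uncompressed tarball."""
    rows = []
    for name, data in blobs.items():
        plain = lzma.compress(data, preset=level)
        extreme = lzma.compress(data, preset=level | lzma.PRESET_EXTREME)
        # Both streams must decode back to the original bytes.
        if lzma.decompress(plain) != data or lzma.decompress(extreme) != data:
            raise ValueError(f"round trip failed for {name}")
        rows.append((name, len(plain), len(extreme)))
    return rows


def count_wins(rows: List[Tuple[str, int, int]]) -> Tuple[int, int, int]:
    """Return (smaller, same, larger) counts for the extreme preset."""
    smaller = sum(1 for _, plain, extreme in rows if extreme < plain)
    larger = sum(1 for _, plain, extreme in rows if extreme > plain)
    return smaller, len(rows) - smaller - larger, larger
```

Building the mapping with something like `{p.name: p.read_bytes() for p in Path(".").glob("node-v*.tar")}` (after importing `Path` from `pathlib`) and passing it in would show whether the extreme preset wins consistently or only some of the time.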

@PeterDaveHello
Member Author

@bnoordhuis across all released versions and different architectures?

@bnoordhuis
Member

All releases might be a bit excessive but I'd be curious to see the numbers for a few releases from the v4.x, v6.x and v7.x release branches each.

@PeterDaveHello
Member Author

PeterDaveHello commented Mar 26, 2017

v7 has already been tested above; I'll test v4 & v6 later.

@PeterDaveHello
Member Author

node-v4.8.1-darwin-x64.tar.xz            7124752 ->  7149424 (99.6549%)
node-v4.8.1-headers.tar.xz                338352 ->   339964 (99.5258%)
node-v4.8.1-linux-arm64.tar.xz           7550036 ->  7560260 (99.8648%)
node-v4.8.1-linux-armv6l.tar.xz          7455440 ->  7470656 (99.7963%)
node-v4.8.1-linux-armv7l.tar.xz          7458396 ->  7468584 (99.8636%)
node-v4.8.1-linux-ppc64le.tar.xz         7720644 ->  7742768 (99.7143%)
node-v4.8.1-linux-ppc64.tar.xz           7596784 ->  7624740 (99.6334%)
node-v4.8.1-linux-x64.tar.xz             8307500 ->  8318712 (99.8652%)
node-v4.8.1-linux-x86.tar.xz             7932348 ->  7944156 (99.8514%)
node-v4.8.1-sunos-x64.tar.xz             8492692 ->  8520216 (99.6777%)
node-v4.8.1-sunos-x86.tar.xz             7854796 ->  7890112 (99.5524%)
node-v4.8.1.tar.xz                      13155440 -> 13293124 (98.9642%)

@PeterDaveHello
Member Author

node-v6.10.1-darwin-x64.tar.xz     8320216 ->  8290400 (99.6416%)
node-v6.10.1-headers.tar.xz         347024 ->   345344 (99.5159%)
node-v6.10.1-linux-arm64.tar.xz    8432120 ->  8422480 (99.8857%)
node-v6.10.1-linux-armv6l.tar.xz   8312748 ->  8287768 (99.6995%)
node-v6.10.1-linux-armv7l.tar.xz   8320636 ->  8295836 (99.7019%)
node-v6.10.1-linux-ppc64le.tar.xz  8674404 ->  8646288 (99.6759%)
node-v6.10.1-linux-ppc64.tar.xz    8536804 ->  8506192 (99.6414%)
node-v6.10.1-linux-s390x.tar.xz    8955096 ->  8924860 (99.6624%)
node-v6.10.1-linux-x64.tar.xz      9361912 ->  9347648 (99.8476%)
node-v6.10.1-linux-x86.tar.xz      8956748 ->  8941476 (99.8295%)
node-v6.10.1-sunos-x64.tar.xz      9540948 ->  9519380 (99.7739%)
node-v6.10.1-sunos-x86.tar.xz      8828132 ->  8844696 (100.188%)
node-v6.10.1.tar.xz               15746356 -> 15590420 (99.0097%)

@bnoordhuis
Member

Thanks, @PeterDaveHello. Okay, so it's a win most of the time. Let's enable it but can you add it to the XZ_COMPRESSION variable on (or around) line 600 instead of passing it manually everywhere?

@PeterDaveHello
Member Author

@bnoordhuis done.

Member

@bnoordhuis bnoordhuis left a comment

Thanks, LGTM. Probably no point in running the CI, I don't think it tests tarball creation.

@fhinkel
Member

fhinkel commented May 26, 2017

Note to self (or any other collaborator): This is ready to merge, removing the stalled label.

@fhinkel fhinkel removed the stalled Issues and PRs that are stalled. label May 26, 2017
@PeterDaveHello
Member Author

Thanks @fhinkel

refack pushed a commit to refack/node that referenced this pull request May 27, 2017
PR-URL: nodejs#10626
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Franziska Hinkelmann <[email protected]>
@refack
Contributor

refack commented May 27, 2017

Landed in 1474b7a

@refack refack closed this May 27, 2017
@PeterDaveHello PeterDaveHello deleted the improve-xz-compression branch May 27, 2017 22:33
jasnell pushed a commit that referenced this pull request May 28, 2017
PR-URL: #10626
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Franziska Hinkelmann <[email protected]>
@jasnell jasnell mentioned this pull request May 28, 2017
@gibfahn gibfahn mentioned this pull request Jun 15, 2017
3 tasks
@MylesBorins
Contributor

Landed on v6.x; let me know if it should be backed out.

MylesBorins pushed a commit that referenced this pull request Jul 17, 2017
PR-URL: #10626
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Franziska Hinkelmann <[email protected]>
@MylesBorins MylesBorins mentioned this pull request Jul 18, 2017