Replace KB unit with KiB #4293

Merged
merged 1 commit into from
Nov 8, 2017

vstinner
Member

@vstinner vstinner commented Nov 6, 2017

kB (kilo byte) unit means 1000 bytes, whereas KiB ("kibibyte")
means 1024 bytes. KB was misused: replace kB or KB with KiB when
appropriate.

Same change for MB and GB which become MiB and GiB.

Change the output of Tools/iobench/iobench.py.

Round also the size of the documentation from 5.5 MB to 5 MiB.
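
As a rough illustration of the distinction drawn in this description, the sketch below defines both families of units and prints how far apart they drift as the prefix grows. The constant names and the `relative_gap` helper are made up for this example and do not come from the PR.

```python
# Decimal (SI) prefixes: powers of 1000.
KB = 1000
MB = 1000 ** 2
GB = 1000 ** 3

# Binary (IEC) prefixes: powers of 1024.
KIB = 1024
MIB = 1024 ** 2
GIB = 1024 ** 3


def relative_gap(binary_unit, decimal_unit):
    """Return how much larger the binary unit is, as a fraction (hypothetical helper)."""
    return binary_unit / decimal_unit - 1


for name, binary_unit, decimal_unit in [
    ("KiB vs kB", KIB, KB),
    ("MiB vs MB", MIB, MB),
    ("GiB vs GB", GIB, GB),
]:
    print(f"{name}: {relative_gap(binary_unit, decimal_unit):.1%}")
# KiB vs kB: 2.4%
# MiB vs MB: 4.9%
# GiB vs GB: 7.4%
```

The gap is small at the kilo level but grows with each prefix, which is why the ambiguity matters more for MB and GB than for KB.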

@vstinner
Member Author

vstinner commented Nov 6, 2017

@@ -234,8 +234,8 @@ def link(self, target_desc, objects, output_filename, output_dir=None,
 # who wants symbols and a many times larger output file
 # should explicitly switch the debug mode on
 # otherwise we let dllwrap/ld strip the output file
-# (On my machine: 10KB < stripped_file < ??100KB
-# unstripped_file = stripped_file + XXX KB
+# (On my machine: 10KiB < stripped_file < ??100KiB
Member

I'm not sure the difference between KB and KiB is important here.

@@ -29,9 +29,9 @@ def text_open(fn, mode, encoding=None):
 return open(fn, mode)

 def get_file_sizes():
-for s in ['20 KB', '400 KB', '10 MB']:
+for s in ['20 KiB', '400 KiB', '10 MiB']:
Member

This file should be kept compatible with both Python 2.6 and Python >= 3.0.

Changing the output format will make it harder to compare results between versions.

Member Author

This script is compatible with Python 2.7 (I didn't test Python 2.6).

If the output must be the same with the script in Python 2.7, I can update Python 2.7 as well.

@pitrou: Since you wrote the script, what do you think?

Member

No problem from me.

@@ -907,7 +907,7 @@ def detect_modules(self):
 missing.append('_hashlib')

 # We always compile these even when OpenSSL is available (issue #14693).
-# It's harmless and the object code is tiny (40-50 KB per module,
+# It's harmless and the object code is tiny (40-50 KiB per module,
Member

This is an approximate value.

@serhiy-storchaka
Member

It may also be worth correcting MB and GB (as in f8def28). But not all decimal prefixes can be changed.

@vstinner
Member Author

vstinner commented Nov 6, 2017

> It may also be worth correcting MB and GB (as in f8def28).

Done.

Member

@serhiy-storchaka serhiy-storchaka left a comment

There are cases where the space between the digits and the unit is missing, like 256KB.

 </tr>
 <tr><td>EPUB</td>
-<td><a href="{{ dlbase }}/python-{{ release }}-docs.epub">Download</a> (ca. 5.5 MB)</td>
+<td><a href="{{ dlbase }}/python-{{ release }}-docs.epub">Download</a> (ca. 5.5 MiB)</td>
Member

Are you sure the size in MB and MiB is the same? Are binary prefixes preferred here?

Member Author

I'm quite sure that the file sizes were computed with a unit of 1024 * 1024 bytes, i.e. MiB.

Member

The size of the EPUB file in 3.7 is 5.426 MB or 5.175 MiB. 5.5 is closer to 5.426, but in any case this is an approximate estimate. The question is which unit is more human-readable. What units are used by a file manager in Windows and OS X?
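
The two figures quoted in this comment can be reproduced with a short calculation. This is only a sketch: it assumes that "5.426 MB" means exactly 5,426,000 bytes, which the thread does not state.

```python
MB = 1000 ** 2   # decimal megabyte
MIB = 1024 ** 2  # binary mebibyte

# Assumption for illustration: 5.426 MB taken as exactly 5,426,000 bytes.
size_bytes = 5_426_000

size_in_mb = size_bytes / MB
size_in_mib = size_bytes / MIB

print(f"{size_in_mb:.3f} MB")    # 5.426 MB
print(f"{size_in_mib:.3f} MiB")  # 5.175 MiB
```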

Member Author

> What units are used by a file manager in Windows and OS X?

https://blogs.msdn.microsoft.com/oldnewthing/20090611-00/?p=17933

"Explorer is just following existing practice. Everybody (to within experimental error) refers to 1024 bytes as a kilobyte, not a kibibyte."

I prefer KiB rather than KB since KiB is defined as 1024 bytes, whereas KB (or kB) may be 1000 or 1024 bytes.

> The size of the EPUB file in 3.7 is 5.426 MB or 5.175 MiB. 5.5 is closer to 5.426, but in any case this is an approximate estimate.

I don't think that it's useful to give the file size. If we want to keep it, maybe we should drop the digits after the decimal point.

These numbers are likely outdated; it's very hard to keep them up to date.

Member

If Windows Explorer uses the multiplier 1024 but calls it KB, it is good to use binary prefixes here. If it uses the multiplier 1000, perhaps it is better to use decimal prefixes.

Member Author

> If it uses the multiplier 1000, perhaps it is better to use decimal prefixes.

I'm not aware of any application using decimal prefixes...

Member

Hard disk capacity is usually specified with decimal prefixes.

I have just tested: Windows uses binary multipliers (but labels them with decimal prefixes), so it is better to use binary multipliers to get the same numbers, but with the correct binary prefixes.
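
To make the three labelling conventions in this exchange concrete, here is a minimal sketch of a size formatter. `format_size` and the style names are hypothetical and exist only for this example; they are not taken from CPython or from Windows.

```python
# Hypothetical helper, not part of the PR: formats a byte count under the
# three conventions discussed in this thread.
STYLES = {
    # binary multiplier, binary prefix
    "iec": (1024, ["B", "KiB", "MiB", "GiB", "TiB"]),
    # decimal multiplier, decimal prefix
    "si": (1000, ["B", "kB", "MB", "GB", "TB"]),
    # binary multiplier, decimal-looking label (what the comment above describes)
    "legacy": (1024, ["B", "KB", "MB", "GB", "TB"]),
}


def format_size(num_bytes, style="iec"):
    """Scale num_bytes down by the style's base until it fits the largest unit."""
    base, labels = STYLES[style]
    value = float(num_bytes)
    for label in labels:
        if value < base or label == labels[-1]:
            return f"{value:.2f} {label}"
        value /= base


size = 256 * 1024
for style in ("legacy", "iec", "si"):
    print(style, format_size(size, style))
# legacy 256.00 KB
# iec 256.00 KiB
# si 262.14 kB
```

The "legacy" and "iec" styles print the same number and differ only in the label, which is the argument for keeping binary multipliers and correcting the prefix.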

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hard drive manufacturers use decimal prefixes; all software I know of uses binary. That's most of the cause behind a 3 TB drive appearing as 2.7 TB to Windows/Linux.

That said, unless you actually care about byte-precision, 5.5 MB is mentally equivalent to 5.5 MiB, or 6 MB, or 5 MB.
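
The drive-size gap mentioned in this comment can be checked with a short calculation. This sketch assumes the drive holds exactly 3 × 10^12 bytes and that the operating system divides by 1024^4.

```python
TB = 1000 ** 4   # decimal terabyte, as used on drive labels
TIB = 1024 ** 4  # binary tebibyte, the divisor assumed for the OS display

drive_bytes = 3 * TB  # assumed: a nominal "3 TB" drive with no reserved space

reported = drive_bytes / TIB
print(f"{reported:.2f}")  # 2.73
```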

@serhiy-storchaka
Member

Please also look at cases where the unit is not separated from the number by a space.

@vstinner
Member Author

vstinner commented Nov 8, 2017

@serhiy-storchaka: "Please look also on cases where unit is not separated by a space from a number."

Oh, I missed them. Right, there were many of them :-)
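
One way to hunt for such occurrences is a regular expression that matches a number immediately followed by a unit. This is a hypothetical sketch, not the method used for this PR; the pattern and the `find_unspaced_units` name are assumptions made for illustration.

```python
import re

# Digits directly followed by kB/KB/MB/GB, with no space in between.
UNSPACED_UNIT = re.compile(r"\b\d+(?:[kK]B|MB|GB)\b")


def find_unspaced_units(text):
    """Return every 'number+unit' token written without a separating space."""
    return UNSPACED_UNIT.findall(text)


sample = "buffer of 256KB, cache of 8 MB, limit 2GB"
print(find_unspaced_units(sample))  # ['256KB', '2GB']
```

Each hit would still need a manual look, since some decimal uses (such as the bz2 block size) are correct as written.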

I rebased and completed my PR.

@vstinner
Member Author

vstinner commented Nov 8, 2017

I found one place in Python where "kB" means 1000 B: the bz2 module. The compression level is the block size in units of 100,000 B. I replaced 100 kB with 100,000 B to prevent any risk of confusion.

@vstinner
Member Author

vstinner commented Nov 8, 2017

I chose to write "100,000 B" since it's what I found in the official bzip2 documentation:

http://www.bzip.org/1.0.5/bzip2-manual-1.0.5.html#memory-management

The flags -1 through -9 specify the block size to be 100,000 bytes through 900,000 bytes (the default) respectively.
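
The relationship quoted from the bzip2 manual can be sketched as follows. `bzip2_block_size` is a hypothetical helper written for this example; only `bz2.compress` and `bz2.decompress` are real standard-library calls.

```python
import bz2


def bzip2_block_size(compresslevel):
    """Block size in bytes for a bzip2 level: level x 100,000 (decimal, not KiB)."""
    if not 1 <= compresslevel <= 9:
        raise ValueError("compresslevel must be between 1 and 9")
    return compresslevel * 100_000


data = b"spam" * 10_000
packed = bz2.compress(data, compresslevel=9)
restored = bz2.decompress(packed)

print(bzip2_block_size(1), bzip2_block_size(9))  # 100000 900000
print(restored == data)                          # True
```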

@serhiy-storchaka
Member

But writing some numbers (like (2 << 25)) looks strange to me. I would write (1 << 26) or (64 << 20).
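
For reference, the three spellings in this comment denote the same value. The short check below confirms it and shows why `64 << 20` reads most directly as "64 MiB".

```python
MIB = 1 << 20  # 1 MiB = 2**20 bytes

candidates = {
    "2 << 25": 2 << 25,
    "1 << 26": 1 << 26,
    "64 << 20": 64 << 20,
}

for expression, value in candidates.items():
    print(f"{expression} = {value} bytes = {value // MIB} MiB")
# 2 << 25 = 67108864 bytes = 64 MiB
# 1 << 26 = 67108864 bytes = 64 MiB
# 64 << 20 = 67108864 bytes = 64 MiB
```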

@vstinner
Member Author

vstinner commented Nov 8, 2017

I replaced "900,000 B" with "900,000 bytes". The "B" unit is rarely written like that; "bytes" is more usual.
