Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-32107 - Better merge of #4494 #4576

Merged
merged 35 commits into from
Nov 27, 2017
Merged

bpo-32107 - Better merge of #4494 #4576

merged 35 commits into from
Nov 27, 2017

Conversation

warsaw
Copy link
Member

@warsaw warsaw commented Nov 26, 2017

warsaw and others added 30 commits November 21, 2017 12:00
The original check was erroneous in two ways.  First, the documentation said
that "47 bit will never be set in IEEE 802 addresses obtained from network
cards".  I think this is referring to the universally vs. locally administered
MAC addresses.  Network cards from hardware manufacturers will always be
universally administered in order to guarantee global uniqueness.

But from https://en.wikipedia.org/wiki/MAC_address it says that this flag is
the "distinguished by setting the second-least-significant bit of the first
octet of the address", where a 0 means universally administered and a 1 means
it's locally administered.  This works out to the 42nd bit when counting the
LSB as bit 1, or 1<<41.

The second bug is that the original bitmask value isn't right for either
description:

% ./python.exe -c "from math import log2; print(log2(0x010000000000))"
40.0

This causes the test to fail on valid MAC addresses.

Fix this by improving the comment, with references, and using the correct
bitmask.
Also, clean up some return sites.
* Add a helper function to determine whether a MAC address is universally
  administered or not.
* In the various MAC getter functions, keep track of the first locally
  administered MAC, and if no universally administered MAC is found, return
  the first local MAC.
* In the ultimate fallback _random_getnode(), add a comment for the multicast
  bit being set, and use a better bitmask
* In getnode(), just fall back to _random_getnode() explicitly if no other MAC
  has been found.
While it is required that the multicast bit (1<<40) should be set in
random generated addresses, there is no requirement that it should be
cleared in addresses obtained from network cards.
The original check was erroneous in two ways.  First, the documentation said
that "47 bit will never be set in IEEE 802 addresses obtained from network
cards".  I think this is referring to the universally vs. locally administered
MAC addresses.  Network cards from hardware manufacturers will always be
universally administered in order to guarantee global uniqueness.

But from https://en.wikipedia.org/wiki/MAC_address it says that this flag is
the "distinguished by setting the second-least-significant bit of the first
octet of the address", where a 0 means universally administered and a 1 means
it's locally administered.  This works out to the 42nd bit when counting the
LSB as bit 1, or 1<<41.

The second bug is that the original bitmask value isn't right for either
description:

% ./python.exe -c "from math import log2; print(log2(0x010000000000))"
40.0

This causes the test to fail on valid MAC addresses.

Fix this by improving the comment, with references, and using the correct
bitmask.
Also, clean up some return sites.
* Add a helper function to determine whether a MAC address is universally
  administered or not.
* In the various MAC getter functions, keep track of the first locally
  administered MAC, and if no universally administered MAC is found, return
  the first local MAC.
* In the ultimate fallback _random_getnode(), add a comment for the multicast
  bit being set, and use a better bitmask
* In getnode(), just fall back to _random_getnode() explicitly if no other MAC
  has been found.
…torchaka/cpython into test_uuid

This merges the relevant bits of python#4572 and should be considered instead of
that PR.
The original check was erroneous in two ways.  First, the documentation said
that "47 bit will never be set in IEEE 802 addresses obtained from network
cards".  I think this is referring to the universally vs. locally administered
MAC addresses.  Network cards from hardware manufacturers will always be
universally administered in order to guarantee global uniqueness.

But from https://en.wikipedia.org/wiki/MAC_address it says that this flag is
the "distinguished by setting the second-least-significant bit of the first
octet of the address", where a 0 means universally administered and a 1 means
it's locally administered.  This works out to the 42nd bit when counting the
LSB as bit 1, or 1<<41.

The second bug is that the original bitmask value isn't right for either
description:

% ./python.exe -c "from math import log2; print(log2(0x010000000000))"
40.0

This causes the test to fail on valid MAC addresses.

Fix this by improving the comment, with references, and using the correct
bitmask.
Also, clean up some return sites.
* Add a helper function to determine whether a MAC address is universally
  administered or not.
* In the various MAC getter functions, keep track of the first locally
  administered MAC, and if no universally administered MAC is found, return
  the first local MAC.
* In the ultimate fallback _random_getnode(), add a comment for the multicast
  bit being set, and use a better bitmask
* In getnode(), just fall back to _random_getnode() explicitly if no other MAC
  has been found.
While it is required that the multicast bit (1<<40) should be set in
random generated addresses, there is no requirement that it should be
cleared in addresses obtained from network cards.
@warsaw warsaw changed the title Better merge of #4494 bpo-32107 - Better merge of #4494 Nov 26, 2017
@warsaw warsaw removed the skip news label Nov 26, 2017
@@ -626,13 +673,14 @@ def getnode():
getters = [_unix_getnode, _ifconfig_getnode, _ip_getnode,
_arp_getnode, _lanscan_getnode, _netstat_getnode]

for getter in getters + [_random_getnode]:
for getter in getters:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The difference is that exceptions in _random_getnode() no longer ignored. Ignoring all exceptions is not a good idea, but what exceptions can raise _random_getnode()?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the only possibility is an import error. Ultimately, random.getrandbits() boils down to os.urandom() which I think can block but AFAICT won't raise an exception. And it bothers me (as I think it bothers you) that all exceptions in all those getters are just swallowed. Here's my thinking: since I'm not sure of the best way to report any errors in the other getters, and since the last getter is _random_getnode(), and since it's pretty unlikely that the function will experience an ImportError, and if it does then I think it's better to report that than to silently swallow the exception and return None from getnode(), since that will lead to even more mysterious bugs, I think letting any rare exceptions in _random_getnode() not be swallowed is the better choice.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's LGTM. I just wanted to make sure that this is an intentional change.

# `test_random_getnode()` method specifically. Another case is the
# Travis-CI case, which apparently only has locally administered MAC
# addresses.
if not random and not os.getenv('TRAVIS'):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If add this test then restore the network parameter. check_node() is called for addresses obtained from network cards, for random addresses, but it also is called in cases where it is not known (and is not important) whether the address is real or random.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, if you look at the original code, every call to check_node() except the one in test_random_getnode() called it with network=True. If I do add it back, which I'm willing to do, then I propose that we call this argument multicast, default it to False, and make it a keyword-only argument. I will see how that looks.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I will change random to local.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are three tests in which check_node() is called without network=True. Only one of them explicitly tests _random_getnode(). In the other twos we can't say what is the source of the address. We shouldn't test either 1<<40 nor 1<<41 bits in that cases.

self.assertTrue(0 < node < (1 << 48),
"%s is not an RFC 4122 node ID" % hex)

@unittest.skipUnless(os.name == 'posix', 'requires Posix')
@unittest.skipIf(os.getenv('TRAVIS'),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't skip this test. It should test that corresponding method don't raise an exception, crash or hang. And returns a reasonable integer or None. Just skip additional checks.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I meant to remove those skipIfs; I'll push an update for that.

* Restored the check for the multicast bit.
* Remove some now-extraneous skipIfs.
self.assertFalse(node & (1 << 41), '%012x' % node)
is_multicast = (node & (1 << 40))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't test the multicast bit for _unix_getnode() and _windll_getnode(). Actually we should test it only for _random_getnode(), because the other methods can return an address with this bit set or clear (your original issue).

@@ -527,21 +529,22 @@ def check_node(self, node, requires=None, *, random=False):
# `test_random_getnode()` method specifically. Another case is the
# Travis-CI case, which apparently only has locally administered MAC
# addresses.
if not random and not os.getenv('TRAVIS'):
if not local and not os.getenv('TRAVIS'):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't test the multicast bit for _unix_getnode() and _windll_getnode().

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I'm removing the multicast bit check from check_node() since only the test for _random_getnode() needs to check that.

* Rename `local` to `admin` and use this to control whether the local
  vs. universal admin bit is set.  Generally we do want to test that, except
  for the random case, and the two cases where we don't know what the
  provenance of the MAC address is.

* We only need to check the multicast bit in the random test.
def check_node(self, node, requires=None, *,
# Additional bitmask checks to perform.
local=False, multicast=False):
def check_node(self, node, requires=None, *, admin=True):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The meaning of the admin flag is less obvious than the meaning of the opposite network flag. I expected this is something related to administrative privileges. Could you add an elaboration?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me, network wasn't really descriptive either. Something like check_universally_administered_bit would be more descriptive but prohibitively long. We could go back to local or switch the parity to universal. I'm open to other suggestions.

self.check_node(node, 'unix')
# Since we don't know the provenance of the MAC address, don't check
# whether it is locally or universally administered.
self.check_node(node, 'unix', admin=False)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to not do non-guarantied tests by default, and do them only if this is specified explicitly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I obviously mildly (+0) prefer it the other way around. Would your opinion change if we named the flag universal?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is your check, I left it on to you.

Copy link
Member Author

@warsaw warsaw Nov 27, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I chose check_bit as the parameter name, but didn't change the default value. I'm happy with this.

@warsaw warsaw merged commit 9522a21 into python:master Nov 27, 2017
@warsaw warsaw deleted the test_uuid_2 branch November 27, 2017 19:40
@warsaw
Copy link
Member Author

warsaw commented Nov 27, 2017

@serhiy-storchaka Thanks for the great review!

vstinner added a commit that referenced this pull request Nov 27, 2017
@warsaw warsaw restored the test_uuid_2 branch November 27, 2017 23:43
warsaw added a commit that referenced this pull request Nov 29, 2017
Remove a flakey test and rewrite another one for readability.
warsaw added a commit that referenced this pull request Nov 29, 2017
Remove a flakey test and rewrite another one for readability.
@warsaw warsaw deleted the test_uuid_2 branch November 29, 2017 15:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants