-
-
Notifications
You must be signed in to change notification settings - Fork 30.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-32107 - Better merge of #4494 #4576
Conversation
The original check was erroneous in two ways. First, the documentation said that "47 bit will never be set in IEEE 802 addresses obtained from network cards". I think this is referring to the universally vs. locally administered MAC addresses. Network cards from hardware manufacturers will always be universally administered in order to guarantee global uniqueness. But from https://en.wikipedia.org/wiki/MAC_address it says that this flag is the "distinguished by setting the second-least-significant bit of the first octet of the address", where a 0 means universally administered and a 1 means it's locally administered. This works out to the 42nd bit when counting the LSB as bit 1, or 1<<41. The second bug is that the original bitmask value isn't right for either description: % ./python.exe -c "from math import log2; print(log2(0x010000000000))" 40.0 This causes the test to fail on valid MAC addresses. Fix this by improving the comment, with references, and using the correct bitmask.
Also, clean up some return sites.
* Add a helper function to determine whether a MAC address is universally administered or not. * In the various MAC getter functions, keep track of the first locally administered MAC, and if no universally administered MAC is found, return the first local MAC. * In the ultimate fallback _random_getnode(), add a comment for the multicast bit being set, and use a better bitmask * In getnode(), just fall back to _random_getnode() explicitly if no other MAC has been found.
While it is required that the multicast bit (1<<40) should be set in random generated addresses, there is no requirement that it should be cleared in addresses obtained from network cards.
The original check was erroneous in two ways. First, the documentation said that "47 bit will never be set in IEEE 802 addresses obtained from network cards". I think this is referring to the universally vs. locally administered MAC addresses. Network cards from hardware manufacturers will always be universally administered in order to guarantee global uniqueness. But from https://en.wikipedia.org/wiki/MAC_address it says that this flag is the "distinguished by setting the second-least-significant bit of the first octet of the address", where a 0 means universally administered and a 1 means it's locally administered. This works out to the 42nd bit when counting the LSB as bit 1, or 1<<41. The second bug is that the original bitmask value isn't right for either description: % ./python.exe -c "from math import log2; print(log2(0x010000000000))" 40.0 This causes the test to fail on valid MAC addresses. Fix this by improving the comment, with references, and using the correct bitmask.
Also, clean up some return sites.
* Add a helper function to determine whether a MAC address is universally administered or not. * In the various MAC getter functions, keep track of the first locally administered MAC, and if no universally administered MAC is found, return the first local MAC. * In the ultimate fallback _random_getnode(), add a comment for the multicast bit being set, and use a better bitmask * In getnode(), just fall back to _random_getnode() explicitly if no other MAC has been found.
…torchaka/cpython into test_uuid This merges the relevant bits of python#4572 and should be considered instead of that PR.
The original check was erroneous in two ways. First, the documentation said that "47 bit will never be set in IEEE 802 addresses obtained from network cards". I think this is referring to the universally vs. locally administered MAC addresses. Network cards from hardware manufacturers will always be universally administered in order to guarantee global uniqueness. But from https://en.wikipedia.org/wiki/MAC_address it says that this flag is the "distinguished by setting the second-least-significant bit of the first octet of the address", where a 0 means universally administered and a 1 means it's locally administered. This works out to the 42nd bit when counting the LSB as bit 1, or 1<<41. The second bug is that the original bitmask value isn't right for either description: % ./python.exe -c "from math import log2; print(log2(0x010000000000))" 40.0 This causes the test to fail on valid MAC addresses. Fix this by improving the comment, with references, and using the correct bitmask.
Also, clean up some return sites.
* Add a helper function to determine whether a MAC address is universally administered or not. * In the various MAC getter functions, keep track of the first locally administered MAC, and if no universally administered MAC is found, return the first local MAC. * In the ultimate fallback _random_getnode(), add a comment for the multicast bit being set, and use a better bitmask * In getnode(), just fall back to _random_getnode() explicitly if no other MAC has been found.
While it is required that the multicast bit (1<<40) should be set in random generated addresses, there is no requirement that it should be cleared in addresses obtained from network cards.
@@ -626,13 +673,14 @@ def getnode(): | |||
getters = [_unix_getnode, _ifconfig_getnode, _ip_getnode, | |||
_arp_getnode, _lanscan_getnode, _netstat_getnode] | |||
|
|||
for getter in getters + [_random_getnode]: | |||
for getter in getters: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The difference is that exceptions in _random_getnode()
no longer ignored. Ignoring all exceptions is not a good idea, but what exceptions can raise _random_getnode()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the only possibility is an import error. Ultimately, random.getrandbits()
boils down to os.urandom()
which I think can block but AFAICT won't raise an exception. And it bothers me (as I think it bothers you) that all exceptions in all those getters are just swallowed. Here's my thinking: since I'm not sure of the best way to report any errors in the other getters, and since the last getter is _random_getnode()
, and since it's pretty unlikely that the function will experience an ImportError
, and if it does then I think it's better to report that than to silently swallow the exception and return None
from getnode()
, since that will lead to even more mysterious bugs, I think letting any rare exceptions in _random_getnode()
not be swallowed is the better choice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's LGTM. I just wanted to make sure that this is an intentional change.
Lib/test/test_uuid.py
Outdated
# `test_random_getnode()` method specifically. Another case is the | ||
# Travis-CI case, which apparently only has locally administered MAC | ||
# addresses. | ||
if not random and not os.getenv('TRAVIS'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If add this test then restore the network
parameter. check_node()
is called for addresses obtained from network cards, for random addresses, but it also is called in cases where it is not known (and is not important) whether the address is real or random.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, if you look at the original code, every call to check_node()
except the one in test_random_getnode()
called it with network=True
. If I do add it back, which I'm willing to do, then I propose that we call this argument multicast
, default it to False
, and make it a keyword-only argument. I will see how that looks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And I will change random
to local
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are three tests in which check_node()
is called without network=True
. Only one of them explicitly tests _random_getnode()
. In the other twos we can't say what is the source of the address. We shouldn't test either 1<<40
nor 1<<41
bits in that cases.
Lib/test/test_uuid.py
Outdated
self.assertTrue(0 < node < (1 << 48), | ||
"%s is not an RFC 4122 node ID" % hex) | ||
|
||
@unittest.skipUnless(os.name == 'posix', 'requires Posix') | ||
@unittest.skipIf(os.getenv('TRAVIS'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't skip this test. It should test that corresponding method don't raise an exception, crash or hang. And returns a reasonable integer or None. Just skip additional checks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, I meant to remove those skipIf
s; I'll push an update for that.
* Restored the check for the multicast bit. * Remove some now-extraneous skipIfs.
Lib/test/test_uuid.py
Outdated
self.assertFalse(node & (1 << 41), '%012x' % node) | ||
is_multicast = (node & (1 << 40)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't test the multicast bit for _unix_getnode()
and _windll_getnode()
. Actually we should test it only for _random_getnode()
, because the other methods can return an address with this bit set or clear (your original issue).
Lib/test/test_uuid.py
Outdated
@@ -527,21 +529,22 @@ def check_node(self, node, requires=None, *, random=False): | |||
# `test_random_getnode()` method specifically. Another case is the | |||
# Travis-CI case, which apparently only has locally administered MAC | |||
# addresses. | |||
if not random and not os.getenv('TRAVIS'): | |||
if not local and not os.getenv('TRAVIS'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't test the multicast bit for _unix_getnode()
and _windll_getnode()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I'm removing the multicast bit check from check_node()
since only the test for _random_getnode()
needs to check that.
* Rename `local` to `admin` and use this to control whether the local vs. universal admin bit is set. Generally we do want to test that, except for the random case, and the two cases where we don't know what the provenance of the MAC address is. * We only need to check the multicast bit in the random test.
Lib/test/test_uuid.py
Outdated
def check_node(self, node, requires=None, *, | ||
# Additional bitmask checks to perform. | ||
local=False, multicast=False): | ||
def check_node(self, node, requires=None, *, admin=True): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The meaning of the admin
flag is less obvious than the meaning of the opposite network
flag. I expected this is something related to administrative privileges. Could you add an elaboration?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For me, network
wasn't really descriptive either. Something like check_universally_administered_bit
would be more descriptive but prohibitively long. We could go back to local
or switch the parity to universal
. I'm open to other suggestions.
Lib/test/test_uuid.py
Outdated
self.check_node(node, 'unix') | ||
# Since we don't know the provenance of the MAC address, don't check | ||
# whether it is locally or universally administered. | ||
self.check_node(node, 'unix', admin=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to not do non-guarantied tests by default, and do them only if this is specified explicitly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I obviously mildly (+0) prefer it the other way around. Would your opinion change if we named the flag universal
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is your check, I left it on to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I chose check_bit
as the parameter name, but didn't change the default value. I'm happy with this.
@serhiy-storchaka Thanks for the great review! |
https://bugs.python.org/issue32107