Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

InetAddresses.forString(String) parses non-ASCII decimals #5682

Closed
Marcono1234 opened this issue Aug 10, 2021 · 1 comment
Closed

InetAddresses.forString(String) parses non-ASCII decimals #5682

Marcono1234 opened this issue Aug 10, 2021 · 1 comment
Assignees
Labels

Comments

@Marcono1234
Copy link
Contributor

Marcono1234 commented Aug 10, 2021

The method InetAddresses.forString(String) parses non-ASCII decimals due to its usage of Character.digit(...).
For example:

// /127.0.0.1
System.out.println(InetAddresses.forString("127.0.0.\u0661"));
// /0:0:0:0:0:0:0:1
System.out.println(InetAddresses.forString("0:0:0:0:0:0:0:\u0661"));

While it appears there is no RFC which strictly specifies the syntax, related RFCs, such as RFC 3986 specifying the URI syntax, explicitly only allow ASCII 0-9 (respectively f for IPv6).
Additionally there is this draft which also only permits ASCII digits.

Note however, that Java's InetAddress.getByName(String) does support non-ASCII digits as well, though it is not clear whether that is intended or just an oversight.

(This issue might be relevant for #3417)

@eamonnmcmanus
Copy link
Member

I made the appropriate change and tried it out against the full range of Google's internal tests. I found that some Google projects had tests explicitly verifying that addresses like 127.0.0.1 work, where those are "fullwidth" digits. We can fix the code in question so that it sanitizes non-ASCII digits before calling InetAddresses.forString (or the various other InetAddresses methods), but it seems a bit risky. I think it would be better to document that these methods accept non-ASCII digits, and explain how to reject them if you don't want that (for example !CharMatcher.ascii().matchesAllOf(address). The fact that java.net.InetAddress and java.net.URL also accept non-ASCII digits does argue in favour of InetAddresses continuing to accept them.

I didn't see any test failures from requiring that the port number in HostAndPort be ASCII, and I think that change might be safer to make. (#5681)

copybara-service bot pushed a commit that referenced this issue Aug 12, 2021
In `HostAndPort.fromString`, we previously allowed any Unicode digit. But the RFCs generally call for ASCII digits specifically.

Document that `InetAddresses.forString` and related methods _do_ allow non-ASCII digits and explain how to check for their presence. We found that a number of projects were testing that these digits work, so changing that seems risky.

Fixes #5681.
Fixes #5682.

Thanks to @Marcono1234 for the bug report.

RELNOTES=`net`: Port numbers spelled with non-ASCII digits are no longer allowed in `HostAndPort.fromString`.
PiperOrigin-RevId: 389966160
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants