Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test failures on s390x: endianness problems? #210

Open
CyrilBrulebois opened this issue Dec 10, 2022 · 11 comments
Open

Test failures on s390x: endianness problems? #210

CyrilBrulebois opened this issue Dec 10, 2022 · 11 comments

Comments

@CyrilBrulebois
Copy link

In addition to the hopefully easy 32-bit issue (#209), I'm wondering whether big endian systems might be entirely busted. That's hard to say for sure since s390x is the only one big endian system available on the https://ci.debian.net/ infrastructure…

I'll just quote one occurrence:

=== RUN   TestConfigureNAT
    nftables_test.go:313: message 2: got -- want
          02 00 00 00 -- 02 00 00 00
        ! 00 08 00 01 -- 08 00 01 00
          6e 61 74 00 -- 6e 61 74 00
        ! 00 08 00 02 -- 08 00 02 00
          00 00 00 00 -- 00 00 00 00
           -- 

In ! lines, see how byte pairs are swapped. There are many more occurrences of this issue.

Here's an apparently different issue, but maybe that's the same reason (with the length being misread due to swapped bytes):

=== RUN   TestGetRules
    nftables_test.go:831: message 0: got -- want
          02 00 00 00 -- 02 00 00 00
        ! 00 0b 00 01 -- 0b 00 01 00
          66 69 6c 74 -- 66 69 6c 74
          65 72 00 00 -- 65 72 00 00
        ! 00 0a 00 02 -- 0a 00 02 00
          69 6e 70 75 -- 69 6e 70 75
          74 00 00 00 -- 74 00 00 00
           -- 
    nftables_test.go:853: invalid attribute; length too short or too large
--- FAIL: TestGetRules (0.00s)

and another one (might also be due to swapped bytes):

=== RUN   TestListChains
    nftables_test.go:1550: error returned from TestDial unexpected header type: got unknown(0), want unknown(2563)
--- FAIL: TestListChains (0.00s)

There's a final issue that looks like bytes are getting stashed in the wrong direction, and not just swapped:

=== RUN   TestOtherTypes
    binaryutil_test.go:136: PutInt32 failure, expected: []byte{0x78, 0x56, 0x34, 0x12}, got: []byte{0x12, 0x34, 0x56, 0x78}
--- FAIL: TestOtherTypes (0.00s)
FAIL

The full log is available at: https://ci.debian.net/data/autopkgtest/testing/s390x/g/golang-github-google-nftables/29113914/log.gz and I'm attaching it for reference.
golang-github-google-nftables_s390x_29113914.log

@stapelberg
Copy link
Collaborator

I think the issue might be that the underlying mdlayher/netlink library is not big-endian compatible: mdlayher/netlink#201 (many tests are skipped on big-endian systems in that package).

@mdlayher
Copy link
Contributor

It is big endian compatible, but a lot of the tests use byte fixtures that are little endian. I don't have access to any big endian machines, but PRs are welcome.

@CyrilBrulebois
Copy link
Author

Thanks for the reassuring answer regarding the code's correctness. I've decided to go ahead and upload a package that allows the test suite to fail on big endian archs.

I'm not really a Go developer or a porter, so I can't really provide a pull request to fix the tests. However, I do have access to a s390x porter box (via Debian), and can test any commits or branches you'd like me to.

@stapelberg
Copy link
Collaborator

Thanks for having a look, @mdlayher. Please let us know here when a new netlink release is available and I can pull that in :)

@mdlayher
Copy link
Contributor

No worries! I will try to do that today.

@mdlayher
Copy link
Contributor

mdlayher commented Dec 12, 2022

I've released netlink v1.7.1 and genetlink v1.3.1 which fix all known endianness test failures. A lot of my tests hardcode fixed little endian byte dumps which isn't ideal but is useful for development.

stapelberg added a commit that referenced this issue Dec 12, 2022
@stapelberg
Copy link
Collaborator

Thanks, now pulled in!

@CyrilBrulebois Can you check which issues remain, if any, please?

@CyrilBrulebois
Copy link
Author

@stapelberg Sure thing, but as far as I understand:

  • netlink is BE compatible;
  • netlink's tests were not BE compatible; that's confirmed to be fixed in the last release.
  • nftables is believed to be BE compatible as well?
  • nftables's tests are not BE compatible; the updated dependency on netlink doesn't change that, so the failures are exactly the same as before (at least from glancing at them, I can do a more systematic review if you prefer).

@stapelberg
Copy link
Collaborator

I only had a brief look thus far, maybe I have misinterpreted what I was seeing.

I’m not sure if I get a chance to have a closer look over the next few weeks, so any help with this is appreciated.

turekt added a commit to turekt/nftables that referenced this issue Mar 13, 2023
@turekt
Copy link
Contributor

turekt commented Mar 13, 2023

Hi @CyrilBrulebois and @stapelberg,

I have created a draft PR #218 with a fix approach for one test. In the log file, I see that there are a lot of failed test cases and I can change them all. Unfortunately, as there are a lot of tests to change, this requires time and I would like to get this approach reviewed and tested against a BE system before proceeding with a complete refactor.

If you can, please let me know whether this fixed the test on a BE system and whether this approach makes sense. If you have ideas how to best tackle this issue (e.g. skip tests on BE systems? or something else?), feel free to let me know.

@stapelberg
Copy link
Collaborator

I don’t have time to test on s390x currently, so if @CyrilBrulebois could give the PR a shot, that’d be great! Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants