Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TextDecoder should not output BOM #25315

Closed
joyeecheung opened this issue Jan 2, 2019 · 4 comments
Closed

TextDecoder should not output BOM #25315

joyeecheung opened this issue Jan 2, 2019 · 4 comments
Labels
confirmed-bug Issues with confirmed bugs. encoding Issues and PRs related to the TextEncoder and TextDecoder APIs.

Comments

@joyeecheung
Copy link
Member

Found when porting encoding WPT into core.

Spec: https://encoding.spec.whatwg.org/#concept-td-serialize

If encoding is UTF-8, UTF-16BE, or UTF-16LE, and ignore BOM flag and BOM seen flag are unset, then:

  • If token is U+FEFF, then set BOM seen flag.
  • Otherwise, if token is not end-of-stream, then set BOM seen flag and append token to output.
  • Otherwise, return output.

Failing tests:

@joyeecheung joyeecheung added the encoding Issues and PRs related to the TextEncoder and TextDecoder APIs. label Jan 2, 2019
@joyeecheung
Copy link
Member Author

cc @nodejs/i18n

@joyeecheung
Copy link
Member Author

Sorry, pinged the wrong team @nodejs/intl

@joyeecheung joyeecheung added the confirmed-bug Issues with confirmed bugs. label Jan 2, 2019
@Xstoudi
Copy link
Contributor

Xstoudi commented Oct 24, 2019

PR merged, should be close.

@targos
Copy link
Member

targos commented Oct 24, 2019

PR was merged, but the bug still exists. Both failing tests are skipped:

"textdecoder-byte-order-marks.any.js": {
"fail": "Mismatching BOM should not be ignored"
},
"textdecoder-copy.any.js": {
"fail": "Should not have output BOM: https://encoding.spec.whatwg.org/#concept-td-serialize"
},

addaleax added a commit to addaleax/node that referenced this issue Nov 1, 2019
Do not accept the BOM if it comes from a different encoding, and
only discard the BOM after it has actually been read (including
when it is spread over multiple chunks in streaming mode).

Fixes: nodejs#25315
MylesBorins pushed a commit that referenced this issue Nov 17, 2019
Do not accept the BOM if it comes from a different encoding, and
only discard the BOM after it has actually been read (including
when it is spread over multiple chunks in streaming mode).

Fixes: #25315

PR-URL: #30132
Reviewed-By: Gus Caplan <[email protected]>
targos pushed a commit that referenced this issue Dec 1, 2019
Do not accept the BOM if it comes from a different encoding, and
only discard the BOM after it has actually been read (including
when it is spread over multiple chunks in streaming mode).

Fixes: #25315

PR-URL: #30132
Reviewed-By: Gus Caplan <[email protected]>
MylesBorins pushed a commit that referenced this issue Dec 17, 2019
Do not accept the BOM if it comes from a different encoding, and
only discard the BOM after it has actually been read (including
when it is spread over multiple chunks in streaming mode).

Fixes: #25315

PR-URL: #30132
Reviewed-By: Gus Caplan <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
confirmed-bug Issues with confirmed bugs. encoding Issues and PRs related to the TextEncoder and TextDecoder APIs.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants