Is there any non ascii support? #62

Open
ian4hu opened this issue Oct 26, 2016 · 7 comments
ian4hu commented Oct 26, 2016

I used QRious like qrious.value='Greeting from 中国', but when I scanned the code, only 'Greeting from ' was left; the non-ASCII characters were lost.

Is there a way to keep the non-ASCII characters?

@neocotic
Owner

I'm not sure how to implement this but I'd love to! I'm open to any help that can be offered to see if we can support such characters or if it is even possible within the QR code specs.

dsh0416 commented Mar 19, 2017

The QR code specification can carry UTF-8-encoded text through byte mode (whose nominal default character set is ISO-8859-1). The qrcode.js project supports UTF-8 this way. I'd like to spend some time making it work in this library.
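
To make the byte-mode idea concrete, here is a minimal sketch that lists the UTF-8 bytes a QR encoder would have to carry for the text from the original report. It assumes a runtime with the standard `TextEncoder` (so not IE 11); the helper name `utf8Hex` is made up for this illustration and is not part of QRious or qrcode.js.

```javascript
// Hypothetical helper: show the UTF-8 bytes of a string as hex.
// Assumes the standard TextEncoder is available (modern browsers, Node.js).
function utf8Hex(str) {
  var bytes = new TextEncoder().encode(str); // Uint8Array of UTF-8 bytes
  return Array.from(bytes, function (b) {
    return (b < 16 ? '0' : '') + b.toString(16).toUpperCase();
  }).join(' ');
}

console.log(utf8Hex('中国')); // E4 B8 AD E5 9B BD
```

Each of the two characters takes three bytes, so byte mode has to store six bytes for them rather than two "characters".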

nayuki commented Aug 27, 2018

My QR Code generator library has full Unicode support, including emoji / astral planes. Have a look at qrcodegen.js#L881.

@wolfiesonfire

Converting the text to UTF-8 first solved my problem.

function utf16to8(str) {
    var out, i, len, c;
    out = "";
    len = str.length;
    for (i = 0; i < len; i++) {
        c = str.charCodeAt(i);
        if ((c >= 0x0001) && (c <= 0x007F)) {
            out += str.charAt(i);
        } else if (c > 0x07FF) {
            out += String.fromCharCode(0xE0 | ((c >> 12) & 0x0F));
            out += String.fromCharCode(0x80 | ((c >>  6) & 0x3F));
            out += String.fromCharCode(0x80 | ((c >>  0) & 0x3F));
        } else {
            out += String.fromCharCode(0xC0 | ((c >>  6) & 0x1F));
            out += String.fromCharCode(0x80 | ((c >>  0) & 0x3F));
        }
    }
    return out;
}

nayuki commented Nov 29, 2018

@wolfiesonfire Your code doesn't translate UTF-16 surrogate pairs into proper UTF-8 sequences. So any characters above U+FFFF will be broken. Also, your code needlessly escapes U+0000, resulting in illegal over-long UTF-8.
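
A small sketch of both points, for readers following along. It only uses standard string methods; `combineSurrogates` is a name invented for this illustration. The first part shows how a surrogate pair maps to a single code point above U+FFFF, and the second reproduces what the two-byte branch of the function above emits for U+0000.

```javascript
// Hypothetical helper: combine a UTF-16 surrogate pair into one code point.
function combineSurrogates(high, low) {
  return ((high - 0xD800) << 10) + (low - 0xDC00) + 0x10000;
}

var smiley = '\uD83D\uDE00'; // U+1F600, stored as two UTF-16 code units
var cp = combineSurrogates(smiley.charCodeAt(0), smiley.charCodeAt(1));
console.log(cp.toString(16)); // 1f600

// For c = 0 the two-byte branch yields C0 80, an over-long form;
// valid UTF-8 encodes U+0000 as the single byte 00.
var c = 0;
var overlong = [0xC0 | ((c >> 6) & 0x1F), 0x80 | (c & 0x3F)];
console.log(overlong); // [ 192, 128 ]
```

Encoding each surrogate on its own would instead produce two three-byte sequences, which UTF-8 decoders are expected to reject.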

@wolfiesonfire

@nayuki Thanks for the explanation; I see the problem now.

se-ti commented Oct 5, 2022

Correct code that supports Cyrillic and emoji would look like this:

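function utf32to8 (str) {
	var out, i, len, c;
	out = "";
	len = str.length;
	// Prefer codePointAt so a surrogate pair is read as one code point.
	// IE 11 doesn't have codePointAt; with the charCodeAt fallback each
	// surrogate is encoded on its own, so characters above U+FFFF do not
	// come out as valid UTF-8 there.
	var codeAt = str.codePointAt || str.charCodeAt;
	for (i = 0; i < len; i++) {
		c = codeAt.call(str, i);
		if (c >= 0x10000) {
			// U+10000 and above: four bytes
			out += String.fromCharCode(0xF0 | ((c >> 18) & 0x07));
			out += String.fromCharCode(0x80 | ((c >> 12) & 0x3F));
			out += String.fromCharCode(0x80 | ((c >>  6) & 0x3F));
			out += String.fromCharCode(0x80 | ((c >>  0) & 0x3F));
		}
		else if (c >= 0x0800) {
			// U+0800..U+FFFF: three bytes
			out += String.fromCharCode(0xE0 | ((c >> 12) & 0x0F));
			out += String.fromCharCode(0x80 | ((c >>  6) & 0x3F));
			out += String.fromCharCode(0x80 | ((c >>  0) & 0x3F));
		}
		else if (c >= 0x0080) {
			// U+0080..U+07FF: two bytes
			out += String.fromCharCode(0xC0 | ((c >> 6) & 0x1F));
			out += String.fromCharCode(0x80 | ((c >> 0) & 0x3F));
		}
		else {
			// ASCII, including U+0000: one byte, copied as is
			out += str.charAt(i);
		}

		// A code point above U+FFFF spans two UTF-16 code units;
		// skip the second one.
		if (str.charCodeAt(i) != c)
			i++;
	}

	return out;
}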

P.S. The issue with IE is fixed.
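
For runtimes that do have the standard `TextEncoder`, a shorter route to the same kind of "binary string" (one character per UTF-8 byte) is sketched below. This is an alternative technique, not what the thread's code does, and `toUtf8BinaryString` is a made-up name. Whether a given scanner then interprets the bytes as UTF-8 is an assumption that needs testing on real devices.

```javascript
// Hypothetical helper: UTF-8 encode, then map each byte to one character
// so the result can be passed where a plain string is expected.
// Assumes the standard TextEncoder is available (not IE 11).
function toUtf8BinaryString(str) {
  var bytes = new TextEncoder().encode(str);
  var out = '';
  for (var i = 0; i < bytes.length; i++) {
    out += String.fromCharCode(bytes[i]);
  }
  return out;
}

var encoded = toUtf8BinaryString('Greeting from 中国');
console.log(encoded.length); // 20 (14 ASCII bytes + 2 * 3 bytes)

// Possible use with QRious, following the pattern from the original report:
//   qrious.value = toUtf8BinaryString('Greeting from 中国');
```

It can also serve as a cross-check for a hand-written converter on well-formed input. Note that `TextEncoder` replaces lone surrogates with U+FFFD, so the two approaches can differ on malformed strings.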
