-c works with -x but not -l #689

Open
Jimmy-Z opened this issue Apr 5, 2023 · 2 comments
Labels
help wanted Need outside help

Comments

Jimmy-Z commented Apr 5, 2023

I have a zip file with file names encoded in CP936/GBK. -x -c 936 extracts the files correctly, but:

  • the file names in the extraction log are garbled
  • -l -c 936 prints the same garbled file names.

Piping the output through iconv works, e.g. minizip -l a.zip | iconv -f gbk -t utf8.

It seems -c doesn't affect -l in any way.
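
For readers unfamiliar with this kind of garbling, here is a minimal Python sketch of why the iconv pipe helps. It assumes the listing prints the stored GBK bytes unchanged and that the terminal expects UTF-8; the sample file name is chosen for illustration only, and nothing here is minizip code.

```python
# Sample Chinese file name (an assumption for illustration).
name = "新建文档.docx"

# Bytes as they would be stored in a zip written with code page 936 (GBK).
raw = name.encode("gbk")

# A UTF-8 terminal interprets those bytes as UTF-8. Invalid sequences turn
# into U+FFFD replacement characters, which is the garbling in the log.
garbled = raw.decode("utf-8", errors="replace")

# Decoding with the correct code page, which is what `iconv -f gbk -t utf8`
# does to the piped output, recovers the original name.
recovered = raw.decode("gbk")

print(garbled)
print(recovered)
```

The point is only that the bytes are intact and the wrong decoder is applied at display time, which would be consistent with -x writing correct names to disk while the log looks wrong.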

Contributor

pmqs commented Apr 6, 2023

I tested the attached zip file, folder.zip, on my Ubuntu setup, running a fresh minizip:

$ minizip -h  
minizip-ng 3.0.9 - https://github.com/zlib-ng/minizip-ng

The zip file contains the following:

$ unzip -l -O cp936 folder.zip 
Archive:  folder.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
        0  2019-07-09 13:27   folder/
        0  2019-07-09 13:26   folder/新建文本文档.txt
        0  2019-07-09 13:27   folder/新建文档.docx
---------                     -------
        0                     3 files

First, try listing its contents with minizip:

$ minizip -c 936 -l  folder.zip 
minizip-ng 3.0.9 - https://github.com/zlib-ng/minizip-ng
---------------------------------------------------
-c -l folder.zip 
      Packed     Unpacked Ratio Method   Attribs Date     Time  CRC-32     Name
      ------     -------- ----- ------   ------- ----     ----  ------     ----
           0            0    0% stored        10 07-09-19 06:27 00000000   folder/
           0            0    0% stored        20 07-09-19 06:26 00000000   folder/�½��ı��ĵ�.txt
           0            0    0% stored        20 07-09-19 06:27 00000000   folder/�½��ĵ�.docx

I see the same encoding issue. Now use minizip to extract the contents of the zip file:

$ minizip -c 936 -x  folder.zip 
minizip-ng 3.0.9 - https://github.com/zlib-ng/minizip-ng
---------------------------------------------------
-c -x folder.zip 
Archive folder.zip
Extracting folder/
Extracting folder/�½��ı��ĵ�.txt
Extracting folder/�½��ĵ�.docx

Note the encoding issue in the Extracting... lines.

Check what was written to disk.

$ ls -l folder
total 0
-rw-rw-rw- 1 paul paul 0 Jul  9  2019 新建文本文档.txt
-rw-rw-rw- 1 paul paul 0 Jul  9  2019 新建文档.docx

That looks fine.

Looks like there are (at least) two places where the code isn't doing what is expected when the -c option is specified.

After a brief look at the code, I see that mz_os_utf8_string_create is used to do the UTF-8 encoding of the filename. That function is only called from mz_zip_reader_save_all, which is part of the extract workflow.
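
One possible shape for a fix, sketched in Python rather than in minizip's C: route every place that prints an entry name through the same conversion step. The names to_display_name, list_entries and extract_log are hypothetical and only model the idea; they are not minizip-ng functions, and the real change would have to go in the C listing and logging code.

```python
from typing import List, Optional


def to_display_name(raw_name: bytes, code_page: Optional[int]) -> str:
    """Hypothetical helper: decode a stored entry name for printing."""
    if code_page is None:
        # No -c given: treat the stored name as UTF-8.
        return raw_name.decode("utf-8", errors="replace")
    # -c given: decode with the requested Windows code page, e.g. cp936.
    return raw_name.decode(f"cp{code_page}", errors="replace")


def list_entries(raw_names: List[bytes], code_page: Optional[int]) -> List[str]:
    """Model of the -l path: convert each name before printing it."""
    return [to_display_name(n, code_page) for n in raw_names]


def extract_log(raw_names: List[bytes], code_page: Optional[int]) -> List[str]:
    """Model of the -x log lines: use the same conversion as the listing."""
    return ["Extracting " + to_display_name(n, code_page) for n in raw_names]


# Entry names as GBK bytes, mimicking the archive discussed in this issue.
stored = [s.encode("gbk") for s in ("folder/", "folder/新建文档.docx")]

print(list_entries(stored, 936))
print(extract_log(stored, 936))
```

With a single helper, the list output, the extraction log, and the names written to disk cannot drift apart, which is the inconsistency reported here.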

Member

nmoinvaz commented Apr 8, 2023

It would be helpful if you could submit a PR.

nmoinvaz added the help wanted label May 5, 2023