Bug 287936 - [Import/Export] Import project archive (zip) does not work with umlauts / unicode
Status: NEW
Alias: None
Product: Platform
Classification: Eclipse Project
Component: IDE
Version: 3.4.1
Hardware: PC Windows XP
Importance: P3 normal
Target Milestone: ---
Assignee: Platform UI Triaged CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-08-28 05:37 EDT by Martin Domig CLA
Modified: 2019-09-06 16:06 EDT
CC List: 1 user

Description Martin Domig CLA 2009-08-28 05:37:30 EDT
Build ID: M20080911-1700

Steps To Reproduce:
1. Create a project containing files and folders whose names include umlauts or other Unicode characters
2. Use the Windows XP zip feature (Send To -> zip file) to create a zip archive of that project from your workspace
3. Import that zip file into your workspace using the "Import Existing Projects into Workspace" wizard

Result: Files with umlauts in the name are mangled. Folders with umlauts in the name are EMPTY.

More information:
The problem seems to be the Sun zip implementation (java.util.zip), which does not work correctly with file names containing characters outside the ASCII range (code points above 127). Probably the most viable solution would be to use something other than java.util.zip.

See also http://www.velocityreviews.com/forums/t147286-javautilzip-not-handling-unicode-filenames.html

When the archive is created with the Eclipse export wizards, the round trip works correctly (at least as long as everything is done on the same system). However, Eclipse should be able to import zip files created by third-party applications.
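
For illustration only, here is a minimal sketch of reading entry names with an explicitly chosen charset. It assumes Java 7 or later, where java.util.zip.ZipFile gained a constructor that takes a Charset; that constructor postdates this report and is not available in the build named above. The class name ZipNameLister and the method listNames are hypothetical, and which charset a given third-party tool actually used for names is an assumption the caller has to supply.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper for illustration; not part of Eclipse or the JDK.
final class ZipNameLister {

    // Returns the entry names of the archive, decoding them with nameCharset.
    // Requires Java 7+ because of the ZipFile(File, Charset) constructor.
    static List<String> listNames(File archive, Charset nameCharset) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zip = new ZipFile(archive, nameCharset)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        }
        return names;
    }
}
```

A caller might try something like ZipNameLister.listNames(new File("project.zip"), Charset.forName("IBM850")), where both the file name and the code page are guesses for a Western European Windows XP system; Charset.forName throws UnsupportedCharsetException if the running JVM does not provide that charset. Entries that the archive itself flags as UTF-8 are decoded as UTF-8 regardless of the charset passed in.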
Comment 1 Paul Webster CLA 2009-08-31 08:31:04 EDT
(In reply to comment #0)
> Build ID: M20080911-1700
> 
> Steps To Reproduce:
> 1. Create a project with files/folders in it that contain umlauts/unicode
> characters in the name

What happens when you create the files/folders in Eclipse, export them to an archive, and then import from that archive? Do we mangle them in that case as well?

PW
Comment 2 Martin Domig CLA 2009-08-31 08:44:42 EDT
No, as long as you don't use anything but the wizards provided you're fine.
Comment 3 Paul Webster CLA 2009-08-31 08:54:57 EDT
This is most likely a dup of bug 75184

See bug 272126 comment #18

PW
Comment 4 Martin Domig CLA 2009-08-31 10:12:32 EDT
(In reply to comment #3)

Yes, looks like a dup. I did not find these when I searched the database.

But, to clarify the apparent cause of the problem:

java.util.zip uses a non-standard encoding for file names in zip archives if the file names contain Unicode characters. It uses an encoding that nothing else can generate or understand, although the compression algorithm is the same.
If you use Unicode characters in file names, this renders java.util.zip effectively incompatible with the rest of the world. For that reason I suggest NOT using it to zip/unzip anything in Eclipse, since the majority of people on this planet are native speakers of languages other than English.
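
As a rough illustration of how such a mismatch mangles names, the sketch below encodes a name with one charset and decodes the bytes with another. It does not touch any zip code, and the thread does not establish which encodings each tool really uses, so the charsets here are placeholders. The class name NameMangleDemo and the method roundTrip are hypothetical.

```java
import java.nio.charset.Charset;

// Hypothetical demo for illustration; not part of Eclipse or the JDK.
final class NameMangleDemo {

    // Encodes the name as the writing tool would, then decodes the raw bytes
    // as the reading tool would. If the two charsets differ, non-ASCII
    // characters generally do not survive the round trip.
    static String roundTrip(String name, Charset writer, Charset reader) {
        byte[] rawNameBytes = name.getBytes(writer);
        return new String(rawNameBytes, reader);
    }
}
```

With matching charsets the name comes back unchanged. If the writer uses UTF-8 and the reader assumes ISO-8859-1, a single "ü" comes back as the two characters "Ã¼", which is the kind of mangling described in this report.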
Comment 5 Paul Webster CLA 2009-08-31 10:28:18 EDT
(In reply to comment #4)
> But, to clarify the apparent cause of the problem:
> 
> java.util.zip uses a non-standard encoding for file names in zip archives
> if the file names contain Unicode characters. It uses an encoding that nothing
> else can generate or understand, although the compression algorithm is the same.
> If you use Unicode characters in file names, this renders java.util.zip
> effectively incompatible with the rest of the world. For that reason I suggest
> NOT using it to zip/unzip anything in Eclipse, since the majority of people on
> this planet are native speakers of languages other than English.

This is not likely to happen. java.util.zip is what is available, and it seems from the blog you posted that Sun is not likely to fix their implementation, Windows is not likely to fix theirs, and the Linux one uses a slightly different algorithm than either (one that is mostly compatible with Sun's). I'm not sure what the Mac does.

The use case here is: an OSS, EPL-compatible zip/unzip library that can handle Windows, WinZip, PKZIP, Sun zips, Linux zip, and whatever the Mac offers.

AFAIK that's not available.

PW
Comment 6 Paul Webster CLA 2009-08-31 10:30:33 EDT
If suggestions are available, here's how to contribute: http://wiki.eclipse.org/Platform_UI/How_to_Contribute

PW
Comment 7 Eclipse Webmaster CLA 2019-09-06 16:06:30 EDT
This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet.

If you have further information on the current state of the bug, please add it. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant.