Bug 78584 - [Win32] Unicode character support for the command line arguments
Status: NEW
Alias: None
Product: Platform
Classification: Eclipse Project
Component: Runtime
Version: 3.1
Hardware: PC Windows All
Importance: P3 enhancement
Target Milestone: ---
Assignee: Pascal Rapicault
Reported: 2004-11-14 18:12 EST by Hiroyuki Okamoto
Modified: 2017-03-09 05:53 EST
Description Hiroyuki Okamoto 2004-11-14 18:12:20 EST
On Windows, the user cannot specify Unicode characters that do not exist in the
OS native charset in the command line arguments. If the command line arguments
contain any such characters, the args parameter of
org.eclipse.core.launcher.Main#main() receives '?' characters in their place.

For example, on a system with an English locale, the user cannot create a data
directory whose name contains non-Latin-1 characters (such as Hiragana,
Katakana, Hangul, Cyrillic, Indic, or CJK characters, and so on).

This is because java.exe (in most JDKs/JREs) is not a Unicode application on
Win32: main() in java.c in java.exe can receive only characters in the OS
native charset and cannot receive any other Unicode characters, even if the
Eclipse launcher is compiled as a Unicode application.
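
The effect can be imitated in plain Java. This is only an illustrative sketch:
ISO-8859-1 stands in for the OS native charset, and the class and method names
are hypothetical. The real conversion happens in native code on Windows, not in
Java.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Eclipse or the JDK launcher.
final class LossyArgumentDemo {
    private LossyArgumentDemo() {}

    // Round-trips an argument through a single-byte charset.
    // ISO-8859-1 is an assumption standing in for the OS native charset.
    // String.getBytes(Charset) replaces unmappable characters with the
    // charset's replacement byte, which is '?' for ISO-8859-1.
    static String throughNativeCharset(String arg) {
        byte[] bytes = arg.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Three Katakana characters, written with Java source escapes
        // so that this file stays pure ASCII.
        String dataDir = "data\u30c7\u30fc\u30bf";
        System.out.println(throughNativeCharset(dataDir)); // prints "data???"
    }
}
```

Characters that exist in the stand-in charset (for example U+00E9) survive the
round trip unchanged; only the unmappable ones collapse to '?'.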

So, here is my idea to solve this problem:

1. eclipse.exe encodes all non-ASCII characters (that is, all characters
greater than \u007f) in the command line arguments as Unicode escape sequences
(\uXXXX).

2. org.eclipse.core.launcher.Main#main() decodes the Unicode escape sequences
back into Java Strings and uses the decoded values as its args.
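
A minimal sketch of the two steps, written entirely in Java for illustration.
In the proposal, step 1 would live in the native eclipse.exe launcher, so the
encoder here only shows the intended text transformation. The class and method
names are hypothetical and are not existing Eclipse API.

```java
// Hypothetical helper; not part of org.eclipse.core.launcher.
final class UnicodeEscapes {
    private UnicodeEscapes() {}

    // Step 1: replace every UTF-16 code unit above U+007F with a
    // backslash-u escape made of four lowercase hex digits.
    static String encode(String arg) {
        StringBuilder sb = new StringBuilder(arg.length());
        for (int i = 0; i < arg.length(); i++) {
            char c = arg.charAt(i);
            if (c > 0x7F) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Step 2: turn each backslash-u escape followed by exactly four hex
    // digits back into the character it names; copy everything else as-is.
    static String decode(String arg) {
        StringBuilder sb = new StringBuilder(arg.length());
        int i = 0;
        while (i < arg.length()) {
            if (arg.startsWith("\\u", i) && i + 6 <= arg.length()
                    && isHex(arg, i + 2, i + 6)) {
                sb.append((char) Integer.parseInt(arg.substring(i + 2, i + 6), 16));
                i += 6;
            } else {
                sb.append(arg.charAt(i));
                i++;
            }
        }
        return sb.toString();
    }

    // Applies decode() to every argument, as main() would do on startup.
    static String[] decodeAll(String[] args) {
        String[] decoded = new String[args.length];
        for (int i = 0; i < args.length; i++) {
            decoded[i] = decode(args[i]);
        }
        return decoded;
    }

    private static boolean isHex(String s, int from, int to) {
        for (int i = from; i < to; i++) {
            if (Character.digit(s.charAt(i), 16) < 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Katakana characters written with source escapes to keep the file ASCII.
        String original = "d:\\eclipse\u30c7\u30fc\u30bf";
        String encoded = encode(original);
        System.out.println(encoded); // pure ASCII, safe for a non-Unicode java.exe
        System.out.println(decode(encoded).equals(original)); // prints "true"
    }
}
```

One caveat with this sketch: an argument that already contains a literal
backslash, 'u', and four hex digits (for example a path such as d:\udead) would
also be decoded. The proposal does not say how such input should be handled, so
the sketch leaves that case open.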
Comment 1 Jeff McAffer 2004-11-14 20:29:42 EST
Moving to SWT (home of eclipse.exe) for initial comment and to see if
- there is another way
- this is an issue on all platforms
- this approach works on all platforms
Comment 2 Steve Northover 2004-11-15 13:33:09 EST
Just a question: If the Unicode characters don't exist in the native charset 
for the operating system, then how are they typed on the command line and how 
would a C program create a directory that contained one of these characters?  
Is it true that were the characters to be embedded in a shell script, a native 
text editor for the platform could not see them properly?  Jeff talked about 
Windows only.  The other platforms need to be looked into.

Chrix to investigate.
Comment 3 Christophe Cornu 2004-11-16 09:32:14 EST
Hiroyuki: without using Eclipse, do you know how to create a folder that
contains Hiragana/Katakana characters on a system with an English locale? How
would you do it (with Windows Explorer, the DOS console, ...)?


Comment 4 Hiroyuki Okamoto 2004-11-16 09:58:03 EST
On Windows, we can specify Unicode arguments via:

- the "Run" dialog (Start -> Run...)
- an application shortcut
- CreateProcessW or _wexec*, which can also pass Unicode arguments when
invoking eclipse.exe.

I think this is especially important for implementing an RCP application as a
generic application.

On Unix platforms,
I don't think this is not nessesary.
because we can specify UTF-8 in the LANG environment variable (e.g. en_US.UTF-8).
Comment 5 Hiroyuki Okamoto 2004-11-16 09:59:13 EST
Sorry.
> I don't think this is not nessesary.

I don't think this is necessary.

Comment 6 Udo Walker 2015-09-17 09:23:05 EDT
I think this already works, as I created a workspace directory with Thai characters in the name.

Eclipse version used: Mars on Windows 10 (running in German)

I started Eclipse with the suggested workspace directory. In Eclipse I switched the workspace to a new one, and I used Thai characters in the name of the new one.

Here the name: eclipseะีัะอี

This is also working from command line with argument "-data d:\eclipseะีัะอี".