Bug 337711 - Unicode Characters are not put out correctly on console
Summary: Unicode Characters are not put out correctly on console
Status: CLOSED DUPLICATE of bug 266658
Alias: None
Product: Platform
Classification: Eclipse Project
Component: Debug
Version: 4.1
Hardware: PC Windows 7
Importance: P3 normal
Target Milestone: ---
Assignee: Platform-Debug-Inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-02-21 07:20 EST by Christian CLA
Modified: 2011-06-08 15:02 EDT
CC List: 4 users

See Also:


Description Christian CLA 2011-02-21 07:20:04 EST
Build Identifier: M20100909-0800

Sometimes Unicode characters printed to the console are replaced with replacement characters.

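import java.util.Random;

public class SystemOutTest {

	// Fixed seed so the same pattern is generated on every run.
	static Random rand = new Random(0);

	/**
	 * Prints 10 blocks of 100 lines, each 70 characters wide, mixing
	 * U+25A0 (black square, about 10%) and U+25A1 (white square).
	 */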
	public static void main(String[] args) {
		for (int k = 0; k < 10; k++) {
			StringBuilder sb = new StringBuilder();
			for (int i = 0; i < 100; i++) {
				for (int j = 0; j < 70; j++) {
					sb.append(rand.nextFloat() < 0.1f ?'\u25A0': '\u25A1');
				}
				sb.append("\n");
			}
			System.out.println(sb);
		}
	}

}

Reproducible: Sometimes

Steps to Reproduce:
Compile and run the code above from within Eclipse.
Most of the time, at least one line of the output sticks out past the others.
In that line, one character was replaced with two replacement boxes, so the line is longer and breaks the block.
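
The symptom described above (one character turning into two or more replacement boxes) is consistent with a multi-byte UTF-8 sequence being decoded in separate pieces, although nothing in this report confirms that as the cause. The following is a minimal sketch under that assumption; the class SplitDecodeDemo and its helper decodeInChunks are hypothetical names used only for illustration and are not part of Eclipse.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class SplitDecodeDemo {

	/**
	 * Decodes bytes[0..split) and bytes[split..length) independently as UTF-8
	 * and concatenates the results, mimicking a reader that decodes each
	 * buffer on its own without carrying over an incomplete sequence.
	 */
	public static String decodeInChunks(byte[] bytes, int split) {
		String head = new String(Arrays.copyOfRange(bytes, 0, split), StandardCharsets.UTF_8);
		String tail = new String(Arrays.copyOfRange(bytes, split, bytes.length), StandardCharsets.UTF_8);
		return head + tail;
	}

	public static void main(String[] args) {
		String original = "\u25A1\u25A0\u25A1";
		// Each of these characters takes 3 bytes in UTF-8.
		byte[] bytes = original.getBytes(StandardCharsets.UTF_8);

		// A split on a character boundary round-trips cleanly.
		System.out.println(original.equals(decodeInChunks(bytes, 3)));

		// A split inside the second character produces U+FFFD replacement characters.
		String broken = decodeInChunks(bytes, 4);
		System.out.println(broken.contains("\uFFFD"));
	}
}
```

If the console behaved like decodeInChunks, corruption would only show up when a chunk boundary happens to fall inside a character, which would match the "Reproducible: Sometimes" observation.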
Comment 1 Michael Rennie CLA 2011-02-22 15:46:06 EST
Have you set the console encoding properly? On Windows the default encoding usually cannot display Unicode characters correctly.

You either have to set it on the Common tab using the console encoding options or use the VM argument (-Dfile.encoding=UTF-8 for example).
Comment 2 Christian CLA 2011-02-22 16:41:23 EST
The vast majority of the characters are shown correctly, so I assume the settings there are correct.

(Although I don't know which settings you mean)
Comment 3 Michael Rennie CLA 2011-02-23 11:24:49 EST
(In reply to comment #2)
> The vast majority of the characters are shown correctly, so I assume the
> settings there are correct.
> 
> (Although I don't know which settings you mean)

In the launch configuration you use to run the program there are two tabs where you can set encoding options:

1. The Common tab, which has an 'Encoding' option group
2. The Arguments tab, which has a text area called 'VM arguments' where you can enter the '-Dfile.encoding=UTF-8' VM argument
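
To check which encoding the launched program actually sees, something like the following can be run with the same launch configuration. This is only a sketch: the class name EncodingCheck and the helper utf8Stream are made up for illustration, and wrapping the stream only controls how the program encodes its output, not how the console decodes it.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

public class EncodingCheck {

	/** Reports the encoding-related settings visible inside the running JVM. */
	public static String describe() {
		return "file.encoding=" + System.getProperty("file.encoding")
				+ ", defaultCharset=" + Charset.defaultCharset().name();
	}

	/** Wraps a stream so text is always encoded as UTF-8, whatever the default is. */
	public static PrintStream utf8Stream(OutputStream out) {
		try {
			return new PrintStream(out, true, "UTF-8");
		} catch (UnsupportedEncodingException e) {
			// UTF-8 support is required on every Java platform, so this should not happen.
			throw new IllegalStateException(e);
		}
	}

	public static void main(String[] args) {
		System.out.println(describe());
		// Print the two characters from the report through an explicit UTF-8 stream.
		PrintStream out = utf8Stream(System.out);
		out.println("\u25A0\u25A1");
	}
}
```

If the reported values differ from what is selected on the Common tab, the program and the console disagree about the encoding, which is the situation the two settings above are meant to rule out.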
Comment 4 Christian CLA 2011-02-23 12:23:07 EST
The Common tab was set to UTF-8 (inherited from the workspace). Nothing was set on the Arguments tab.

Also, I doubt any setting could make the majority of characters come out correctly while only sometimes failing on those very same characters.

After all, the example uses only two characters, taken from the program where this first happened.
The bug might already appear with a single character, but I haven't tried that.

I was able to reproduce this problem on two different Windows 7 machines.
Comment 5 Pawel Piech CLA 2011-06-08 15:02:21 EDT

*** This bug has been marked as a duplicate of bug 266658 ***