Bug 236513 - bidi: english characters swapped only in linux
Status: RESOLVED FIXED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: SWT
Version: 3.4
Hardware: PC Linux-GTK
Importance: P3 major
Target Milestone: 3.5 M4
Assignee: Felipe Heidrich CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Duplicates: 225315 233298 233478 233488 233732 233733 234760 234863 234865 234922 234939 235126 235134 235234 235715 235871 235873 236372 236970
Depends on:
Blocks: 233732
 
Reported: 2008-06-10 16:52 EDT by Matthew Mazaika CLA
Modified: 2008-12-11 22:44 EST

See Also:


Attachments
transformations -> compiler options linux-windows comparision (146.03 KB, image/png)
2008-06-10 16:52 EDT, Matthew Mazaika CLA
new datasource - linux-windows comparision (65.99 KB, image/png)
2008-06-10 16:53 EDT, Matthew Mazaika CLA

Description Matthew Mazaika CLA 2008-06-10 16:52:21 EDT
Created attachment 104390 [details]
transformations -> compiler options linux-windows comparision

Somehow, when there are English characters on both ends of a string, the two sets of characters are swapped. This swapping only occurs on Linux; the same strings do not swap on Windows.

We are running RHEL 5.1 with pango-1.16.4-1.fc7



A few places where this occurs are as follows:

1) window -> preferences -> jet transformations -> compiler options
2) database development perspective -> right click "ODA Data Source" and select "New..."

This is also most likely affecting most of the following bugs:
3) bug 234736
4) bug 234735
5) bug 233231
6) bug 234760
7) bug 234759
8) bug 233478
9) bug 233488
10) bug 233298
11) bug 233326
12) bug 233342
13) bug 233519
14) bug 233537
15) bug 233733
16) bug 234268
17) bug 234743
18) bug 234742
19) bug 234863
20) bug 234865
21) bug 234939
22) bug 234922
23) bug 234920
24) bug 234954
25) bug 235126
26) bug 235134
27) bug 235234
28) bug 235328
29) bug 235321
30) bug 235318
31) bug 235327
32) bug 235345
33) bug 235347
34) bug 235715
35) bug 235870
36) bug 235873
37) bug 235871
Comment 1 Matthew Mazaika CLA 2008-06-10 16:53:03 EDT
Created attachment 104391 [details]
new datasource - linux-windows comparision
Comment 2 Felipe Heidrich CLA 2008-06-11 11:49:20 EDT
This happens because the reading order in GTK is defined by the direction of the first strong character in the text.
I didn't visit all the bugs in the list, but I feel safe saying that this problem happens when the text starts with an English word (which causes the default paragraph level to be LTR).

On Windows the paragraph level of a text widget is defined by its style (SWT.RIGHT_TO_LEFT or SWT.LEFT_TO_RIGHT).
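
The difference between the two platforms can be reproduced outside SWT. The following is only a sketch using the JDK's java.text.Bidi class, not SWT or GTK code; the class name, method names, and sample strings are made up for illustration. "Auto" mode picks the paragraph level from the first strong character (the GTK-like behaviour described above), while the "styled" mode imposes it the way a style flag would.

```java
import java.text.Bidi;

class ParagraphDirectionDemo {
    // GTK-like behaviour: the base direction is taken from the first strong
    // character (falling back to LTR if the text has none).
    static boolean autoBaseIsLtr(String text) {
        return new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT).baseIsLeftToRight();
    }

    // Win32-like behaviour: the base direction is imposed by the caller,
    // standing in for SWT.RIGHT_TO_LEFT / SWT.LEFT_TO_RIGHT.
    static boolean styledBaseIsLtr(String text, boolean rightToLeftStyle) {
        int flag = rightToLeftStyle ? Bidi.DIRECTION_RIGHT_TO_LEFT : Bidi.DIRECTION_LEFT_TO_RIGHT;
        return new Bidi(text, flag).baseIsLeftToRight();
    }

    public static void main(String[] args) {
        // Hypothetical sample strings: a Latin word and an Arabic word.
        String startsWithEnglish = "Eclipse \u0645\u0631\u062D\u0628\u0627";
        String startsWithArabic = "\u0645\u0631\u062D\u0628\u0627 Eclipse";

        // Auto-detection follows the first strong character.
        System.out.println("auto, starts with English, LTR? " + autoBaseIsLtr(startsWithEnglish));
        System.out.println("auto, starts with Arabic, LTR?  " + autoBaseIsLtr(startsWithArabic));

        // An explicit RTL style wins regardless of the first character.
        System.out.println("styled RTL, starts with English, LTR? "
                + styledBaseIsLtr(startsWithEnglish, true));
    }
}
```

With an RTL control and a string that starts with an English word, the auto-detected level is LTR, which matches the word order problem reported here.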
Comment 3 Felipe Heidrich CLA 2008-06-11 12:08:45 EDT
See http://bugzilla.gnome.org/show_bug.cgi?id=167746
Comment 4 Kit Lo CLA 2008-06-12 00:07:07 EDT
*** Bug 235871 has been marked as a duplicate of this bug. ***
Comment 5 Kit Lo CLA 2008-06-12 10:03:52 EDT
Felipe, this is a major problem! It is causing many of the Arabic translations to be unreadable or to give out "incorrect" information/instructions! This is affecting the usability of Eclipse in Bidi mode.

I'm going to mark all the bugs in the list above as dups of this bug.

Please raise the priority of this bug.

Thanks!
Comment 6 David Dykstal CLA 2008-06-12 14:00:47 EDT
*** Bug 235627 has been marked as a duplicate of this bug. ***
Comment 7 Felipe Heidrich CLA 2008-06-12 14:05:54 EDT
(In reply to comment #5)
> I'm going to mark all the bugs in the list above as dups of this bug.
That is not correct.

I looked at some of the bugs and they are not dups of this one.
Please make sure a problem really is a duplicate before you close it as such.
Comment 8 CDE Administration CLA 2008-06-12 14:10:07 EDT
*** Bug 234736 has been marked as a duplicate of this bug. ***
Comment 9 CDE Administration CLA 2008-06-12 14:10:24 EDT
*** Bug 233231 has been marked as a duplicate of this bug. ***
Comment 10 CDE Administration CLA 2008-06-12 14:10:43 EDT
*** Bug 234760 has been marked as a duplicate of this bug. ***
Comment 11 CDE Administration CLA 2008-06-12 14:11:02 EDT
*** Bug 234759 has been marked as a duplicate of this bug. ***
Comment 12 CDE Administration CLA 2008-06-12 14:11:20 EDT
*** Bug 233478 has been marked as a duplicate of this bug. ***
Comment 13 Kit Lo CLA 2008-06-12 14:19:16 EDT
We verified that all those bugs in the list only happen on Linux, and they are
not related to the pango library level. We cannot explain why they are
displaying okay on Windows but not on Linux. So, we assume that they are
related to this problem.

Are there any particular ones that you don't think are dups of this bug?
If you could give us some examples, we will investigate. Thanks!
Comment 14 Felipe Heidrich CLA 2008-06-12 14:27:09 EDT
*** Bug 235134 has been marked as a duplicate of this bug. ***
Comment 15 CDE Administration CLA 2008-06-12 14:39:44 EDT
*** Bug 233231 has been marked as a duplicate of this bug. ***
Comment 16 CDE Administration CLA 2008-06-12 15:18:40 EDT
*** Bug 233478 has been marked as a duplicate of this bug. ***
Comment 17 CDE Administration CLA 2008-06-12 16:12:17 EDT
*** Bug 233488 has been marked as a duplicate of this bug. ***
Comment 18 CDE Administration CLA 2008-06-12 16:12:46 EDT
*** Bug 233298 has been marked as a duplicate of this bug. ***
Comment 19 CDE Administration CLA 2008-06-12 16:13:14 EDT
*** Bug 233519 has been marked as a duplicate of this bug. ***
Comment 20 CDE Administration CLA 2008-06-12 16:14:01 EDT
*** Bug 233733 has been marked as a duplicate of this bug. ***
Comment 21 CDE Administration CLA 2008-06-12 16:14:48 EDT
*** Bug 234268 has been marked as a duplicate of this bug. ***
Comment 22 CDE Administration CLA 2008-06-12 16:16:17 EDT
*** Bug 234743 has been marked as a duplicate of this bug. ***
Comment 23 CDE Administration CLA 2008-06-12 16:17:02 EDT
*** Bug 234863 has been marked as a duplicate of this bug. ***
Comment 24 CDE Administration CLA 2008-06-12 16:18:06 EDT
*** Bug 234865 has been marked as a duplicate of this bug. ***
Comment 25 CDE Administration CLA 2008-06-12 16:18:36 EDT
*** Bug 234939 has been marked as a duplicate of this bug. ***
Comment 26 CDE Administration CLA 2008-06-12 16:19:08 EDT
*** Bug 234922 has been marked as a duplicate of this bug. ***
Comment 27 CDE Administration CLA 2008-06-12 16:21:39 EDT
*** Bug 235126 has been marked as a duplicate of this bug. ***
Comment 28 Felipe Heidrich CLA 2008-06-12 16:22:52 EDT
It is a lot of bugs to go through, but at this point I see two different problems:

1) This one, words in the wrong order.
- Only happens on GTK
- Only happens when the text starts with an English word

This is platform behaviour; it happens in all native controls: labels, buttons, etc.


2) Bug 235318, reversed parentheses 
- Happens on win32 and GTK
It manifests a bit differently in win32 and GTK.
In GTK this text will be wrong:
ARABenglish(test)HEBREW   -> display   WERBEH(english(testBARA

In win32, if RTL is set these will be wrong:
english(test)HEBREW       -> display   WERBEH(english(test
In win32, if LTR is set these will be wrong:
HEBREW(ARAB)english       -> display   BARA)WERBEH)english

This is the expected behaviour of the bidi algorithm.
You need to insert LRE+PDF to embed English text in an Arabic document and
RLE+PDF to embed Arabic text in an English document.
I think this can be done for you by the TextProcessor API (I'm not familiar
with this API).

=-=-=-=-=-=-=-=-=-

Please use this description to close duplicate bugs correctly. Mark bugs about
words in the wrong order as dups of this one. Leave the parentheses bugs alone;
the owner of the code is responsible for calling the right API to insert bidi
control characters.
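
As a rough illustration of the LRE+PDF / RLE+PDF embedding mentioned above, here is a minimal sketch in plain Java. It does not use the TextProcessor API; the class and method names are hypothetical. The code points are the standard Unicode ones: LRE is U+202A, RLE is U+202B, PDF is U+202C.

```java
class BidiEmbedding {
    static final String LRE = "\u202A"; // LEFT-TO-RIGHT EMBEDDING
    static final String RLE = "\u202B"; // RIGHT-TO-LEFT EMBEDDING
    static final String PDF = "\u202C"; // POP DIRECTIONAL FORMATTING

    // Wrap a left-to-right fragment (e.g. English) for use inside RTL text.
    static String embedLtr(String fragment) {
        return LRE + fragment + PDF;
    }

    // Wrap a right-to-left fragment (e.g. Arabic) for use inside LTR text.
    static String embedRtl(String fragment) {
        return RLE + fragment + PDF;
    }

    public static void main(String[] args) {
        // Hypothetical message: an English fragment with parentheses placed
        // between two Arabic words.
        String arabic = "\u0645\u0631\u062D\u0628\u0627";
        String message = arabic + " " + embedLtr("english(test)") + " " + arabic;

        // Print the code points so the invisible controls can be seen.
        message.codePoints().forEach(cp -> System.out.printf("U+%04X ", cp));
        System.out.println();
    }
}
```

The embedding keeps the fragment, including its parentheses, together as one left-to-right run, so the closing parenthesis is no longer resolved against the surrounding right-to-left text.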
Comment 29 CDE Administration CLA 2008-06-12 16:22:54 EDT
*** Bug 235234 has been marked as a duplicate of this bug. ***
Comment 30 CDE Administration CLA 2008-06-12 16:23:48 EDT
*** Bug 235328 has been marked as a duplicate of this bug. ***
Comment 31 CDE Administration CLA 2008-06-12 16:24:11 EDT
*** Bug 235715 has been marked as a duplicate of this bug. ***
Comment 32 CDE Administration CLA 2008-06-12 16:24:35 EDT
*** Bug 235870 has been marked as a duplicate of this bug. ***
Comment 33 CDE Administration CLA 2008-06-12 16:25:59 EDT
*** Bug 235873 has been marked as a duplicate of this bug. ***
Comment 34 Felipe Heidrich CLA 2008-06-12 18:04:15 EDT
*** Bug 236970 has been marked as a duplicate of this bug. ***
Comment 35 Felipe Heidrich CLA 2008-06-12 18:15:19 EDT
Okay, I visited each problem report listed here.
<rant>Please, at least try to understand the problem a bit before closing it as duplicate</rant>

Here are my conclusions:
REOPEN	Bug 233231 TVT34:TCT423: The closing parenthesis is reversed and misplaced
reversed parenthesis problem
 
DUP	Bug 234760 TVT34:TCT449: Two strings are (L2R) direction. They sould be (R2L)

NEW	Bug 234759 TVT34:TCT462: The "English" words should come in the end of the string.
more info needed, this might be another dup of 234759

DUP 	Bug 233478 TVT34:TCT479: The string direction is reversed "BiDi", another sting is truncated
Note, no problem report captures the second problem (string is truncated)

DUP 	Bug 235134 TVT34:TCT682: The English text should be on the right side

REOPEN	Bug 233488 TVT34:TCT483: The string direction is reversed.
more info needed

DUP	Bug 233298 TVT34:TCT492: All the variables lables are on wrong the left side

REOPEN	Bug 233519 TVT34:TCT519: SQB: The closing parentheses after the English strings are reversed
reversed parenthesis problem

DUP	Bug 233733 TVT34:TCT550: The English words is misplaced in the Arabic line

REOPEN 	Bug 234268 TVT34:TCT574: The panel is left aligned

REOPEN 	Bug 234743 TVT34:TCT587: The Arabic text should be at the right side. 
more info needed

DUP  	Bug 234863 TVT34:TCT648: The English string should be on the right side.
DUP 	Bug 234865 TVT34:TCT652: The English string should be on the right side.
DUP 	Bug 234939 TVT34:TCT653: The English string should be on the right side.
DUP 	Bug 234922 TVT34:TCT656: The English string should be on the right side.
DUP 	Bug 235126 TVT34:TCT677: The English string should be on the right side.
These are all the same problem; note they all have problems with both reordering and alignment

NEW	Bug 235234 TVT34:TCT699: The English text of all the descriptions should come on the right side
More info needed, this might be another dup of 234759
 
REOPEN 	Bug 235328 TVT34:TCT704: The parentheses are reversed, and the menu items are not translated
reversed parenthesis problem

DUP	Bug 235715 TVT34:TCT742: The English text should be on the right side of the string

DUP	Bug 235870 TVT34:TCT744: The English text is missed up with the Arabic text
more info needed, it might not be a dup of 236513

DUP	 Bug 235873 TVT34:TCT746: The English text should be on the right side
Comment 36 Kit Lo CLA 2008-06-12 21:47:37 EDT
Thanks Felipe! We will try to collect any additional info needed for debugging.

We did do our investigation before opening this bug. We tried the 30+ problems at different levels of Pango. We tried the problems on RHEL, SUSE Linux, and Windows. We could only conclude that all the scenarios displayed okay on Windows but failed on Linux. Before your explanation, we really had no idea that the problems could be categorized into different behaviors. We may have unintentionally marked a few bugs as dups by mistake.

Do you have any idea of the fix in mind? What's the target milestone for fixing this? In this release? In the maintenance release? Or probably in next release?

Thanks!
Comment 37 Felipe Heidrich CLA 2008-06-13 11:31:34 EDT
(In reply to comment #36)
> Do you have any idea of the fix in mind? What's the target milestone for fixing
> this? In this release? In the maintenance release? Or probably in next release?

A few options:
a) Wait for the GTK developers to add API to fix this problem.
b) Insert LRM when a string starts with Arabic in an LTR control and
   insert RLM when a string starts with English in an RTL control.
This can be done by your TextProcessor API.
If you agree with that I can close this problem as won't fix (platform limitation).
If not, I'll look into what it takes to do the same on my end.

We should be able to fix this for the next release, and we can probably put it in the next maintenance release (provided the fix is not too complex/unsafe).
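
Option (b) could look roughly like the sketch below. This is plain Java for illustration only, not the TextProcessor implementation; the class and method names are hypothetical. LRM is U+200E and RLM is U+200F. The main method uses java.text.Bidi in auto-detect mode to stand in for a first-strong-character platform such as GTK.

```java
import java.text.Bidi;

class DirectionMarks {
    static final String LRM = "\u200E"; // LEFT-TO-RIGHT MARK
    static final String RLM = "\u200F"; // RIGHT-TO-LEFT MARK

    // Returns +1 if the first strong character is LTR, -1 if it is RTL,
    // and 0 if the text contains no strong character at all.
    static int firstStrongDirection(String text) {
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            byte d = Character.getDirectionality(cp);
            if (d == Character.DIRECTIONALITY_LEFT_TO_RIGHT) {
                return 1;
            }
            if (d == Character.DIRECTIONALITY_RIGHT_TO_LEFT
                    || d == Character.DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC) {
                return -1;
            }
            i += Character.charCount(cp);
        }
        return 0;
    }

    // Prefix a mark only when the first strong character disagrees with the
    // direction of the control that will display the text.
    static String markForControl(String text, boolean controlIsRtl) {
        int dir = firstStrongDirection(text);
        if (controlIsRtl && dir == 1) {
            return RLM + text;
        }
        if (!controlIsRtl && dir == -1) {
            return LRM + text;
        }
        return text;
    }

    public static void main(String[] args) {
        // Hypothetical label for an RTL control that starts with English.
        String label = "Eclipse \u0645\u0631\u062D\u0628\u0627";
        String marked = markForControl(label, true);

        // With the RLM prefix, first-strong detection now yields RTL.
        boolean ltr = new Bidi(marked, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT).baseIsLeftToRight();
        System.out.println("base direction is LTR after marking? " + ltr);
    }
}
```

Note that the mark becomes part of the string, so any code that reads the text back or computes offsets has to account for the extra character.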
Comment 38 Felipe Heidrich CLA 2008-06-13 11:40:41 EDT
*** Bug 236372 has been marked as a duplicate of this bug. ***
Comment 39 Felipe Heidrich CLA 2008-06-13 16:39:53 EDT
*** Bug 233488 has been marked as a duplicate of this bug. ***
Comment 40 Felipe Heidrich CLA 2008-06-16 10:53:38 EDT
*** Bug 235234 has been marked as a duplicate of this bug. ***
Comment 41 Xiaoxiao Wu CLA 2008-08-26 05:15:12 EDT
*** Bug 233732 has been marked as a duplicate of this bug. ***
Comment 42 Boris Bokowski CLA 2008-08-27 11:57:54 EDT
*** Bug 225315 has been marked as a duplicate of this bug. ***
Comment 43 Boris Bokowski CLA 2008-08-28 15:04:20 EDT
*** Bug 240773 has been marked as a duplicate of this bug. ***
Comment 44 Raji Akella CLA 2008-10-22 13:59:15 EDT
Can we get an ETA on this? We reported 227558, which was closed as a dup of 225315, which in turn was dup'd to this bug.

Comment 45 Mike Wilson CLA 2008-10-27 10:20:18 EDT
Felipe,
   I'm trying to understand this bug... Are you saying that on Linux the SWT.RIGHT_TO_LEFT and SWT.LEFT_TO_RIGHT flags are ignored? This seems like a pretty serious problem to me. I realize there is a platform limitation involved, but this feels like something we would normally look for a workaround for. Is that not possible in this case? Could we always reserve the first character of the control for a direction marker and hack the rest of the API to prevent it from being removed / modified?

Comment 46 Felipe Heidrich CLA 2008-10-27 16:26:28 EDT
Okay, let's go through this part by part.

1. Out of the more than 37 bugs that were marked as duplicates of this problem, only 2 (maybe 3) were in fact duplicates.

2. Raji, 227558 (and 225315) are in fact duplicates of this problem. But this problem only happens when running Right-to-Left with English strings. If you use an Arabic or Hebrew catalog you won't have this problem.

3. This problem is not in the 3.5 Plan right now. NO ETA.

4. McQ, we do not ignore SWT.RIGHT_TO_LEFT in GTK. The difference is that in GTK the style flag does not determine the default text reading order (which is the case in Win32 controls). In GTK, the text reading order is defined by the first strong character in the paragraph.

Note: this is platform behaviour.
Note: both win32 and GTK are in accordance with the bidi algorithm as defined by Unicode in http://www.unicode.org/reports/tr9/


This problem can be fixed in a number of ways.

A) In SWT, as McQ suggested, we can override GTK by inserting bidi controls in the text. We would have to change every place we set/get a string to/from the OS, and fix the text offsets too. This is hard to do for editable controls.

B) Change TextProcessor to handle this case.

C) Fix the translations, the catalogs per se. Sometimes this is the only place able to fix reordering (Bug 235328)

D) Wait for GTK to fix it for us, see comment 3
Comment 47 Felipe Heidrich CLA 2008-11-18 12:33:46 EST
Fixed in HEAD > 20081118

The fix is not in the list (comment 46).
Steve found a way to override the constructor for GObject, so we did that for PangoLayout and changed the auto_dir property to false.

Right now PangoLayout uses the text direction set in the PangoContext, and the PangoContext uses the text direction set in the widget. That is exactly what we want.

The exceptions are GtkEntry and GtkTextView: these two widgets override the direction in the PangoContext. This is how they determine the direction for the PangoContext:
1st. direction of the text
2nd. direction of the keyboard keymap
3rd. direction of the widget
4th. default widget direction

This means the bug still exists in Text and Combo; please open a new problem report for them or use bug 230854.
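
The fallback order listed above for GtkEntry and GtkTextView can be written down as a small model. This is only a sketch of the order as described in this comment, in plain Java; it calls no GTK or Pango API, and every name in it is hypothetical.

```java
class EntryDirectionModel {
    enum Dir { LTR, RTL, NEUTRAL }

    // Walk the sources in the order given in the comment: text, keyboard
    // keymap, widget. The first one with a definite direction wins; if none
    // has one, the default widget direction is used.
    static Dir resolve(Dir text, Dir keymap, Dir widget, Dir defaultDir) {
        for (Dir d : new Dir[] { text, keymap, widget }) {
            if (d != Dir.NEUTRAL) {
                return d;
            }
        }
        return defaultDir;
    }

    public static void main(String[] args) {
        // An RTL widget whose text starts with an English word: the text
        // direction is consulted first, so the widget direction is never reached.
        System.out.println(resolve(Dir.LTR, Dir.RTL, Dir.RTL, Dir.RTL));

        // Text without a definite direction (e.g. empty): falls through to the keymap.
        System.out.println(resolve(Dir.NEUTRAL, Dir.RTL, Dir.LTR, Dir.LTR));
    }
}
```

Because the text direction comes first in this order, an RTL Text or Combo showing a string that starts with English still resolves to LTR, which is why the fix does not cover those two controls.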
Comment 48 Felipe Heidrich CLA 2008-11-18 14:49:56 EST
We think that the changes are too dangerous for an x.x.x release. We are expecting that the more extensive testing for Eclipse 3.5 will uncover any problems.