Bug 229010 - BIDI: Enhance TextProcessor to support "Messages with placeholders" types of complex expressions
Status: ASSIGNED
Alias: None
Product: Equinox
Classification: Eclipse Project
Component: Components
Version: 3.4
Hardware: PC All
Importance: P3 enhancement
Target Milestone: ---
Assignee: equinox.components-inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on: 466345 183164
Blocks: 222889 224921
 
Reported: 2008-04-27 09:08 EDT by Tomer Mahlin CLA
Modified: 2015-05-21 08:14 EDT (History)
13 users (show)

See Also:


Description Tomer Mahlin CLA 2008-04-27 09:08:38 EDT
The context 
============

  In many cases the text displayed to the end user in the Eclipse GUI is constructed from several tokens. 
  One classic example is a "Message including placeholders". In this case a standard message is written at development time and the locations of its placeholders are identified. For example:
   This file {1} does not exist in your workspace: {2}
At run time the placeholders ({1} and {2} in the example above) are replaced with meaningful text and the final message is displayed to the end user. For example:
   This file c:\abc\file.txt does not exist in your workspace: c:\my_workspace
  Another classic example is constructed text. This is a simplified case of "Message including placeholders" in which there is no actual "message". Instead, the text can be seen as a series of placeholders that are replaced by meaningful data at runtime. For example, the list of Most Recently Used files in the main File menu is constructed from items with the following syntax:
 <counter> <file name> [<file path>]
At runtime all placeholders (<counter>, <file name> and <file path>) are replaced with the relevant text. For example:
  1 myfile.txt [myworkspace/mylib]
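
The substitution step described above can be sketched in plain Java. This is only an illustration: it uses java.text.MessageFormat as a stand-in for whatever binding mechanism a given plug-in uses, and the class and method names (PlaceholderDemo, bind) are hypothetical. Since MessageFormat numbers its arguments from 0 and the sample pattern starts at {1}, an unused argument is passed at index 0.

```java
import java.text.MessageFormat;

// Hypothetical demo class; not part of Eclipse or Equinox.
final class PlaceholderDemo {

    // Pattern taken from the example in the description.
    static final String PATTERN =
            "This file {1} does not exist in your workspace: {2}";

    // Replaces {1} and {2} with run-time data. Index 0 is not used by
    // the pattern, so an empty string is passed for it.
    static String bind(String filePath, String workspacePath) {
        return MessageFormat.format(PATTERN, "", filePath, workspacePath);
    }

    public static void main(String[] args) {
        System.out.println(bind("c:\\abc\\file.txt", "c:\\my_workspace"));
    }
}
```

Nothing in this step knows about text direction, which is why the problem described below appears as soon as the message or the inserted data contains Bidi text.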
  
The problem
============
 When the message, the placeholder(s), or both include Bidi text, the display of the entire message might become incorrect and thus incomprehensible. 
 This occurs because the direction of the entire message and the direction of each placeholder might differ from each other. If these directions are not enforced, the Unicode Bidirectional Algorithm (UBA) used for rendering the entire message can change the relative positions of words or even break the integrity of a single word.

 Why can the direction of a placeholder differ from the direction of the entire message? The answer is simple: if the message is translated into a Bidi language, the natural direction of the message is RTL. However, if a placeholder embedded in this message is replaced by a file path, the natural direction of the text replacing the placeholder is LTR.

  Please notice that the problem of preserving the internal structure inside each placeholder (which is what the process function in TextProcessor does) belongs to a different (actually lower) level. 

Why does TextProcessor not solve the problem right now?
===========================================================
  TextProcessor encloses the input text between LRE and PDF control characters. This enforces LTR direction only (RTL enforcement is not handled), and due to Windows OS limitations the LTR direction is enforced only in very specific cases: when the text enclosed by LRE / PDF appears stand-alone the direction is enforced, but in the general case, when this text appears as part of a bigger message, the LTR direction is not guaranteed because the presentation engine used by Windows OS is not strictly conformant to the UBA. 
  Moreover, TextProcessor cannot enforce RTL direction at all, since it uses only LRE/PDF, while enforcing RTL direction requires RLE/PDF.

Samples of known issues
========================

Case 1: Constructed text - defect 224921 - BIDI3.4:HCG_Incorrect display of history in File menu in mirrored mode 

  The list of Most Recently Used files in the main File menu is constructed from items with the following syntax:
 <counter> <file name> [<file path>]
At runtime all placeholders (<counter>, <file name> and <file path>) are replaced with the relevant text. For example:
  1 myfile.txt [myworkspace/mylib]
  In mirrored Eclipse the direction of the <file name> and [<file path>] components must be preserved. 


Case 2: Message with placeholder and constructed text - defect 222927 - BIDI3.4:HCG Incorrect layout of file name containing Hebrew characters in Search view 

a. A message with placeholders:
 ''{U0}'' - {U1} matches found inside  ''{U2}'' ({U3})
We need to enforce the LTR direction for the U3 placeholder. 

b. Constructed text - line item
  {U0} - {U1} {U2} 
 Since the line item is not a sentence formulated in natural language, its overall direction should always be LTR. 
 In addition, please notice that in this case we have 3 insertion units:
  U0 = File name - should always have LTR direction 
  U1 = File path - should always have LTR direction 
  U2 = Number of matches - should have RTL direction, since it is a translated piece of text and can be seen as a small sentence of its own

The solution - design
======================
 Enforcing the direction of a string is simple:
    a. To enforce LTR direction: myText = LRE + LRM + myText + LRM + PDF 
    b. To enforce LTR direction: myText = LRE + LRM + myText + LRM + PDF
 This solution works around the limitation of the presentation engine and handles RTL direction as well as LTR.
 This solution was outlined in the 3rd version of the updated design for complex expressions, published at https://bugs.eclipse.org/bugs/attachment.cgi?id=96407 as part of defect 179191. The author of this document is Bidi architect Mati Allouche.
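
 As a minimal sketch of the wrapping described in this section, the two cases can be written out with the standard Unicode control characters. The RTL case uses RLE and RLM, as the implementation section of this report and Comment 6 do. The class and method names (DirectionMarks, enforceLtr, enforceRtl) are hypothetical and exist only for this illustration.

```java
// Hypothetical illustration class; not part of TextProcessor.
final class DirectionMarks {
    static final char LRM = '\u200e'; // left-to-right mark
    static final char RLM = '\u200f'; // right-to-left mark
    static final char LRE = '\u202a'; // left-to-right embedding
    static final char RLE = '\u202b'; // right-to-left embedding
    static final char PDF = '\u202c'; // pop directional formatting

    // LRE + LRM + text + LRM + PDF
    static String enforceLtr(String text) {
        return new StringBuilder()
                .append(LRE).append(LRM).append(text).append(LRM).append(PDF)
                .toString();
    }

    // RLE + RLM + text + RLM + PDF
    static String enforceRtl(String text) {
        return new StringBuilder()
                .append(RLE).append(RLM).append(text).append(RLM).append(PDF)
                .toString();
    }
}
```

 Each wrapped string is exactly four characters longer than its input: an embedding character and a mark in front, a mark and PDF behind.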

The solution - implementation
=============================
 Of course developers can add the control characters manually. However, it would be much more convenient to have a function in TextProcessor that does the work.  
 The suggested solution implements the design outlined in section 3.8, Message with Placeholders, of the design document published at https://bugs.eclipse.org/bugs/attachment.cgi?id=96407. It adds the following code to the org.eclipse.osgi.util.TextProcessor class.

  Add the following constants for markers

// right to left marker
private static final char RLM = '\u200f';
	
// right to left  embedding
private static final char RLE = '\u202b';	
 
Add the following constants for direction

//left to right text direction  
public static final String DIRECTION_LTR = "LTR";
    
//right to left text direction
public static final String DIRECTION_RTL = "RTL"; 


Add the following function to enforce the direction of the input string to either LTR or RTL, based on the explicitly provided direction value

/** 
  * Enforces the direction of the string according to the direction parameter.
  *   For left-to-right direction, LRE followed by LRM is added at the beginning 
  * of the string and LRM followed by PDF at the end of the string.
  *   For right-to-left direction, RLE followed by RLM is added at the beginning 
  * of the string and RLM followed by PDF at the end of the string.
  * 
  * @param str
  *            the text to process; if <code>null</code>, the string is 
  *            returned as it was passed in
  *
  * @param direction
  *            the string direction is enforced to be LTR if this parameter 
  *            is DIRECTION_LTR and RTL if it is DIRECTION_RTL; 
  *            if the parameter is neither DIRECTION_LTR nor DIRECTION_RTL, 
  *            the string is returned as it was passed in 
  * 
  * @return the processed string
  */  

public static String enforceDirection(String str, String direction) {

  // This function does meaningful work only under the same conditions under
  //    which the process function does.
  if (str == null || str.length() <= 1 || !isSupportedPlatform || !isBidi || direction == null)
     return str;  

  // Make minimal verification on the input parameters
  if (!direction.equals(DIRECTION_LTR) && !direction.equals(DIRECTION_RTL))
        return str; 

  StringBuffer target = new StringBuffer();
  if (direction.equals(DIRECTION_LTR)){           
      target.append(LRE);
      target.append(LRM);
      target.append(str);
      target.append(LRM);
  } else {
      target.append(RLE);
      target.append(RLM);
      target.append(str);
      target.append(RLM);
  }
  target.append(PDF);
  return target.toString();    
} 
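
 To illustrate how such a function might be used for the message from Case 2a, here is a self-contained sketch. It is not the proposed TextProcessor code: the local enforceDirection is a stand-in that omits the isSupportedPlatform and isBidi checks (those fields are internal to TextProcessor), java.text.MessageFormat replaces the real binding mechanism, the {U0}..{U3} placeholders are renumbered to {0}..{3}, and the names SearchLabelDemo and searchLabel are hypothetical.

```java
import java.text.MessageFormat;

// Hypothetical demo class; the real function would live in TextProcessor.
final class SearchLabelDemo {
    static final String DIRECTION_LTR = "LTR";
    static final String DIRECTION_RTL = "RTL";

    // Stand-in for the proposed function, without the platform checks.
    static String enforceDirection(String str, String direction) {
        if (str == null || str.length() <= 1 || direction == null)
            return str;
        if (DIRECTION_LTR.equals(direction))   // LRE + LRM + str + LRM + PDF
            return "\u202a\u200e" + str + "\u200e\u202c";
        if (DIRECTION_RTL.equals(direction))   // RLE + RLM + str + RLM + PDF
            return "\u202b\u200f" + str + "\u200f\u202c";
        return str;                            // unknown direction value
    }

    // Only the last placeholder (the path) gets its direction enforced.
    // In MessageFormat patterns, '' produces one literal single quote.
    static String searchLabel(String text, String count, String scope, String path) {
        return MessageFormat.format(
                "''{0}'' - {1} matches found inside ''{2}'' ({3})",
                text, count, scope, enforceDirection(path, DIRECTION_LTR));
    }
}
```

 The caller decides the direction per placeholder before binding, so the message pattern itself stays free of control characters and can be translated as usual.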


 Add the following function to enforce the direction of the input string to either LTR or RTL, based on a guessed direction value (if the string includes at least one Bidi character the direction is considered RTL, otherwise it is considered LTR). 
 To check for the presence of Bidi characters in the input string, this function uses another function, isRTL, which is already part of the TextProcessor class.

/** 
  * Enforces the direction of the string according to the presence of Bidi 
  * characters. If the string contains at least one RTL character, the string 
  * direction is set to RTL; otherwise it is set to LTR.
  *   For left-to-right direction, LRE followed by LRM is added at the beginning 
  * of the string and LRM followed by PDF at the end of the string.
  *   For right-to-left direction, RLE followed by RLM is added at the beginning 
  * of the string and RLM followed by PDF at the end of the string.
  * 
  * @param str
  *            the text to process; if <code>null</code>, the string is
  *            returned as it was passed in
  * 
  * @return the processed string
  */  
    
public static String enforceDirection(String str) {

   if (str == null || str.length() <= 1 || !isSupportedPlatform || !isBidi)
      return str;  

   char ch;
   boolean isRTLDirection = false;
		
   for (int i = 0, n = str.length(); i < n; i++) {
     ch = str.charAt(i);
     if (Character.isLetter(ch)) {
         if (isRTL(ch)) {
              isRTLDirection = true;
              break;
         }
     }
   }

   return enforceDirection(str, isRTLDirection ? DIRECTION_RTL : DIRECTION_LTR);    
}
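
 The direction guess can be tried outside of TextProcessor with a small stand-alone sketch. The assumption here is that Character.getDirectionality is an acceptable substitute for the isRTL helper, which is internal to TextProcessor and may classify characters differently. The names DirectionGuess and containsRtlLetter are hypothetical.

```java
// Hypothetical illustration class; not part of TextProcessor.
final class DirectionGuess {

    // Returns true if at least one letter in the string has a strong
    // right-to-left directionality (Hebrew, Arabic and similar scripts).
    static boolean containsRtlLetter(String str) {
        for (int i = 0, n = str.length(); i < n; i++) {
            char ch = str.charAt(i);
            if (!Character.isLetter(ch))
                continue; // digits and punctuation do not decide the direction
            byte dir = Character.getDirectionality(ch);
            if (dir == Character.DIRECTIONALITY_RIGHT_TO_LEFT
                    || dir == Character.DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC)
                return true;
        }
        return false;
    }
}
```

 Note that under this rule a single RTL letter makes the whole string RTL, so a mostly Latin file path containing one Hebrew folder name would be wrapped as RTL. Callers that know the expected direction are better served by the two-argument function.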
Comment 1 Dani Megert CLA 2008-04-28 03:01:37 EDT
Moving to 
Comment 2 Dani Megert CLA 2008-04-28 03:02:22 EDT
Maybe the original owner (Tod?) can take a look?
Comment 3 Tod Creasey CLA 2008-04-28 07:50:07 EDT
We won't be doing this in 3.4 as we are feature frozen
Comment 4 Thomas Watson CLA 2008-04-28 07:59:18 EDT
Tod, would you be able to look into this in 3.5?
Comment 5 Thomas Watson CLA 2008-04-28 10:07:30 EDT
CC'ing Felipe for BIDI help.  Felipe, could you provide some help/review of this (not for 3.4)?
Comment 6 Tomer Mahlin CLA 2008-04-29 02:45:53 EDT
 I just wanted to correct a mistake made in the "The solution - design" section. In the following paragraph you can see that item b is identical to item a:
   a. To enforce LTR direction: myText = LRE + LRM + myText + LRM + PDF 
   b. To enforce LTR direction: myText = LRE + LRM + myText + LRM + PDF
 While item a describes a solution for enforcement of LTR direction, item b was supposed to describe a solution for RTL direction enforcement. Thus the correct version of those items is as follows:

    a. To enforce LTR direction: myText = LRE + LRM + myText + LRM + PDF 
    b. To enforce RTL direction: myText = RLE + RLM + myText + RLM + PDF

  I appreciate very much the feedback I got from Mati who first spotted the mistake above.
Comment 7 Thomas Watson CLA 2009-05-05 17:48:23 EDT
Not going to do anything here for 3.5.
Comment 8 Thomas Watson CLA 2009-12-02 16:20:14 EST
I think any such enhancements should be done in the proposal in bug 183164 which is a layer above the framework (org.eclipse.equinox.bidi or something like that).  It is a mistake to keep tacking things into the core framework for this.
Comment 9 Tomer Mahlin CLA 2009-12-03 03:10:34 EST
I agree. Indeed, the proposal for bug 183164 has a basic capability to enforce base text direction. I think we still need a utility function to make its usage convenient for the "Messages with placeholders" case. I will add a comment on this to bug 183164.
Comment 10 Thomas Watson CLA 2010-01-08 13:59:35 EST
Moving to components since this will likely be built upon the bundle coming out of bug 183164.
Comment 11 Thomas Watson CLA 2011-04-28 08:56:17 EDT
Bug 183164 is not targeting 3.7, so I am dropping the milestone of this dependent bug.