Bug 90803 - org.eclipse.osgi.framework.msg.MessageFormat does not handle double-quote conversion
Summary: org.eclipse.osgi.framework.msg.MessageFormat does not handle double-quote con...
Status: RESOLVED FIXED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: Runtime (show other bugs)
Version: 3.1   Edit
Hardware: PC All
: P3 normal (vote)
Target Milestone: 3.1 M7   Edit
Assignee: DJ Houghton CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-04-08 10:29 EDT by Tobias Widmer CLA
Modified: 2005-04-15 12:49 EDT (History)
10 users (show)

See Also:


Attachments
Benchmark for MessageFormat.format vs. NLS.bind (2.46 KB, text/plain)
2005-04-11 11:33 EDT, John Arthorne CLA
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Tobias Widmer CLA 2005-04-08 10:29:11 EDT
I20050405-0800:

Resource bundles which have been converted to the new format using 
org.eclipse.osgi.util.NLS#bind(...) do not work anymore with the common double-
quote "escaping".

The NLS class internally uses a custom implementation of MessageFormat, which 
only provides very limited functionality. Clients using some of the advanced 
features of the java.text.MessageFormat class are broken as well.
Comment 1 Tobias Widmer CLA 2005-04-08 10:29:59 EDT
Sorry, wrong component
Comment 2 Olivier Thomann CLA 2005-04-08 10:35:56 EDT
The conversion should handle the format of the properties files that was fine
for the processing by java.text.MessageFormat.
This class exists in fondation libraries, so I don't see why it is not directly
used.
Right now jdt/core cannot convert to new format, because after the conversion
more than 300 tests are failing.
Set severity to blocker.
Comment 3 Mike Wilson CLA 2005-04-08 11:24:57 EDT
We need to make the one in org.eclipse.osg.framework feature for feature (bug for bug) compatible 
with the one in java.text, which could most easily be done by calling it internally.

This bug needs to be fixed asap, since we want all teams to have enough time to convert without 
causing end-of-cycle stress.
Comment 4 Olivier Thomann CLA 2005-04-08 12:06:49 EDT
The java.text.MessageFormat could go into the exceptions.jar file. There is
already some classes from the java.text package in it.
Comment 5 DJ Houghton CLA 2005-04-08 12:29:38 EDT
OSGi implements its own MessageFormat class because it needs to run on a set of
class libraries which doesn't include java.text.MessageFormat. This class is
internal and spec'd to only perform the simple substitution of int arguments and
not the full spec of java.text.MessageFormat. 

The fact that it doesn't handle the escaping of double quotes (which is used
quite commonly) is a bug and will be fixed. We do not plan on implementing the
full spec of java.text.MessageFormat in the internal MessageFormat class. 

That being said, here is what we are going to do:

1). Fix osgi.MessageFormat to handle double-quoted strings.
2). Update the spec of NLS#bind to better define the cases we handle.
3). We recommend if clients use the full capabilities of the MessageFormat class
then they call it directly rather than calling NLS#bind. (which is really just a
convenience method and not essential...you can use NLS for your message bundles
and not use NLS#bind)
Comment 6 Nick Edgar CLA 2005-04-08 13:17:02 EDT
Does the conversion tool detect strings where this may be an issue?
For example, the as-yet-to-be-converted Problems and Tasks views have
expressions like:

problem.statusSummaryVisible = {0,choice,0#0 items|1#{0,number,integer}
item|1<{0,number,integer} items}: {1}
problem.statusSummaryBreakdown = {0,choice,0#0 errors|1#{0,number,integer}
error|1<{0,number,integer} errors}, {1,choice,0#0 warnings|1#{1,number,integer}
warning|1<{1,number,integer} warnings}, {2,choice,0#0 infos|1#{2,number,integer}
info|1<{2,number,integer} infos}
problem.statusSummarySelected = {0,choice,0#0 items|1#{0,number,integer}
item|1<{0,number,integer} items} selected: {1}
marker.statusSummarySelected = {0,choice,0#0 items|1#{0,number,integer}
item|1<{0,number,integer} items} selected.

Which is a fancy way of saying: use "items" if N==0 or N>1, and "item" if N==1.
Comment 7 DJ Houghton CLA 2005-04-08 14:54:23 EDT
No, I don't believe the conversion tool interprets the values when converting
the message properties files.
Comment 8 DJ Houghton CLA 2005-04-08 18:21:51 EDT
Released fix to HEAD and updated javadoc.

Leaving bug open until javadoc can be reviewed on Monday.
Comment 9 Dirk Baeumer CLA 2005-04-11 06:04:24 EDT
Couldn't OSGI runtime detect which JRE is available and if
java.text.MessageFormat exist simply call this one instead of the OSGI
implementation.
Comment 10 DJ Houghton CLA 2005-04-11 07:31:03 EDT
Yes we thought about that but if did that then the behaviour of NLS#bind would
not be consistent. 
Comment 11 Dirk Baeumer CLA 2005-04-11 08:32:27 EDT
So what's about a method format then which forwards to MessageFormat if present
or does nothing otherwise. This can be doced as well. I think for most of the UI
plug-ins the method bind can't be used due to its limitations.
Comment 12 Dani Megert CLA 2005-04-11 08:42:43 EDT
I second Dirk's comment: currently this method is not usable for most plug-ins.
Using MessageFormat if it's available would fix the problem. Clients who want to
use the method need to read the Javadoc anyway to understand its limitations.
Therefore it could simply have the Javadoc as described in comment 5 for the
case where java.text.MessageFormat isn't available.
Comment 13 Thomas Watson CLA 2005-04-11 09:14:58 EDT
Re comment 11

If your plugin has dependencies on the MessageFormat functionality is it 
better to call the static methods on MessageFormat directly?  If the 
MessageFormat class is not present and the new NLS convenience method does 
nothing what does that mean for your plugin?  Would it stop working?  I 
imagine we would just return the original String with no substitions.

In the absence of java.text.MessageFormat both approaches will result in some 
kind of error.  If you called MessageFormat directly then you would get 
NoClassDefFoundErrors (and likely stop working).  Using a convenience method 
on NLS would result in strings that are not formated correctly (but I imagine 
we will still continue to run).

It seems like the second approach is better as long as we keep a form of the 
current NLS#format method available which embedded applications can call on < 
foundation profiles (the OSGi framework would still need the old format method 
as well).
Comment 14 Jeff McAffer CLA 2005-04-11 09:30:18 EDT
I lost track of first vs second proposals here so will simply state my 
thoughts...

- bind() is a convenience method there for people who don't want to create 
their own String[] for the arguments and have simple formatting requirements.  
It is fast and small.

- MessageFormat is big and slow.

- NLS is in OSGi and must run on the OSGi minimum execution environment 
(basically Gateway+) which does not include MessageFormat

- the use of the new message bundle story does not require the use of bind()

- people previously using ResourceBundles were directly calling MessageFormat.

- adding optional support for MessageFormat (if present) is confusing.  In the 
end, for correct operation, people must assume the worst (i.e., it is not 
present) and thus use only simple formatting when calling bind().  The optional 
support does not buy you anything.

From this it seems like we should just stick with bind() as the simple 
formatter for people with simple needs.  People with more complex needs should 
use the traditional MessageFormat approach.  Teams are free to use 
MessageFormat all the time as they were before and never even think about this 
problem.

Downgrading as there are conventional coding patterns for handling this 
situation.
Comment 15 John Arthorne CLA 2005-04-11 10:13:05 EDT
Dirk and Dani: you mentioned that the #bind method won't be sufficient for most
UI plugins.  I just wanted to clarify that we have added support for
single-quote escaping as specified in MessageFormat.  The only thing we still
don't support is the advanced {Index,FormatType,FormatStyle} facilities.  The
marker view messages mentioned in comment #6 are the only ones I am aware of
that use this.  For the rare cases that need it, just stick with the slow &
expensive java.text.MessageFormat class.
Comment 16 Dirk Baeumer CLA 2005-04-11 10:29:14 EDT
I am more in favour of always using MessageFormat as we did in the past.
Otherwise  you always have to think about whether the format is support or not.
If you change a value in a property file to use some "extended" features you
have even to update the code then.

If NLS will not support full message formats then I opt for deprecate/remove
bind and make the message format class API. This will give a consistent story
for all NLS clients.

Comment 17 John Arthorne CLA 2005-04-11 11:33:47 EDT
Created attachment 19759 [details]
Benchmark for MessageFormat.format vs. NLS.bind

Attached is a little benchmark I wrote to compare performance of
MessageFormat.format versus our NLS.bind method. Here are the results of
formating a message with various numbers of arguments (using NLS from HEAD with
single quote escaping added):

No arguments: NLS is 226% faster than MessageFormat
One argument: NLS is 164% faster than MessageFormat
Two arguments: NLS is 168% faster than MessageFormat

Where performance is important, there is still value in using the simplified
NLS.bind, even though it is not as powerful as MessageFormat.
Comment 18 Dirk Baeumer CLA 2005-04-12 04:51:40 EDT
Some answers to the previous comments:

- the advantage of having a 'full' bind method would be the single point of 
  indirection we would then have in Eclipse to create formatted message. So if
  we want to change something in the future here it would may be easier. That's
  why I asked for it.

- regarding use of the bind method: I don't think that we have too many strings
  that will not be correctly formatted using NLS#bind. But we are simply don't 
  know them. That's why I think plug-ins with many strings (mostly UI
  plug-ins) might not use NLS#bind().

To give clients more confidence in using NLS#bind I see two solutions:

- we provide a tool that checks property files.
- we provide a fast MessageFormat class with the limitations of NLS#bind which
  throws an exception or returns null if it doesn't suppport the format. Then
  clients could format those strings using java.text.MessageFormat.

I traced the startup of Eclipse 3.0.2 to see how much time we spent in
MessageFromat and it is below 1% (it doesn't show up in the trace at all).
However, if we can easliy profit form NLS#bind I think we should take the
performance improvement.
Comment 19 John Arthorne CLA 2005-04-13 09:57:09 EDT
I agree with Dirk's second solution.  To prevent the possibility of
accidentially introducing message formats that don't work with NLS#bind, the
bind methods should throw IllegalArgumentException if passed a message pattern
that uses the advanced FormatStyle and FormatType arguments. I.e., we should
throw an exception for any message that is legal under MessageFormat but not
supported by NLS#bind.  We currently just add "<missing argument>" to the
formatted string in these cases, but an exception would make the limitation of
NLS#bind more obvious.
Comment 20 Nick Edgar CLA 2005-04-13 10:00:58 EDT
I also think the conversion tool should complain if it encounters any format
strings that won't be handled by NLS.bind.
Comment 21 DJ Houghton CLA 2005-04-15 12:39:49 EDT
Ok, I have updated the spec and impl for NLS such that it throws an
IllegalArgumentException when being called with complex formatting strings.

Tobias, is the conversion tool able to be modified as per Nick's suggestion in
comment #20?
Comment 22 Tobias Widmer CLA 2005-04-15 12:49:17 EDT
Yes, this is possible