Re: [babel-dev] Release strategy needed

Hi,

I have followed this list for some time but haven't been very active in
Eclipse i18n yet. I'm an experienced German (Debian) translator and also act
as translation coordinator for a few Open Source projects (written in C/C++,
using gettext).

Let me write from my gettext-based Unix viewpoint; I hope this is OK :-)

On Wed, Oct 15, 2008 at 10:20:05AM -0400, Denis Roy wrote:
> Please witness this bug, where a single 'bad' translation caused errors  
> in Eclipse: https://bugs.eclipse.org/250734

("{0} ({1})" was bogusly translated to "{0}{{1}}")

I'm very surprised such errors can happen. Gettext-based translations,
which are the de facto standard in the Open Source (Unix) world for
C/C++ but also for shell, PHP, Python, Perl and other languages, use PO
files to handle translations. According to the manual page, there is
also support for Java, though I have never worked with it on Java projects.
Such a file looks like this:

msgid ""
msgstr ""
"Project-Id-Version: aptitude 0.2.15.5\n"
"Report-Msgid-Bugs-To: aptitude@xxxxxxxxxxxxxxxxxxx\n"
"POT-Creation-Date: 2008-05-05 13:22+0200\n"
"PO-Revision-Date: 2008-05-05 13:24+0200\n"
"Last-Translator: Jens Seidel <jensseidel@xxxxxxxxxxxx>\n"
"Language-Team: Debian German <debian-l10n-german@xxxxxxxxxxxxxxxx>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=2; plural=n!=1;\n"

#: src/apt_config_treeitems.cc:279
#, c-format
msgid "Editing \"%ls\""
msgstr "Bearbeite »%ls«"

#: src/apt_config_treeitems.cc:401
msgid "%BChoice:%b  "
msgstr "%BWahl:%b  "

The message "Editing \"%ls\"" contains the format directive %ls, which the
translation has to match. Gettext's msgfmt tool, which converts the PO text
file into a binary hash file, checks (when run with its --check option)
whether the format directives match, because xgettext's source code parser
has marked the message as a "c-format" string, and fails with an error if
they do not.

This means that invalid PO files will never be used. Of course a
translation can still be wrong or contain typos, and the program that
uses it may be poorly written and not expect text much longer than the
English original (e.g. because it uses a fixed-size buffer), but
in general a valid PO translation file *never* causes any harm.

So I suggest you just add a similar check to your Java format strings.
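
To make the suggestion concrete, here is a minimal sketch of such a check in
Java. It assumes the translated strings are java.text.MessageFormat patterns
(as "{0} ({1})" suggests); the class name PlaceholderCheck and the method
check are made up for illustration and are not part of Babel or Eclipse. It
only verifies that the translation parses and that its highest argument
index matches the source, so it is much coarser than what msgfmt does.

```java
import java.text.MessageFormat;

// Hypothetical helper, not an existing Babel/Eclipse API.
public class PlaceholderCheck {

    /**
     * Returns null if the translation looks usable, otherwise a short reason.
     * Assumption: both strings are java.text.MessageFormat patterns.
     */
    static String check(String source, String translation) {
        MessageFormat src;
        MessageFormat dst;
        try {
            src = new MessageFormat(source);
        } catch (IllegalArgumentException e) {
            return "source is not a valid pattern: " + e.getMessage();
        }
        try {
            // A pattern such as "{0}{{1}}" is rejected here by MessageFormat.
            dst = new MessageFormat(translation);
        } catch (IllegalArgumentException e) {
            return "translation is not a valid pattern: " + e.getMessage();
        }
        // The array length is the highest argument index used, plus one.
        int expected = src.getFormatsByArgumentIndex().length;
        int actual = dst.getFormatsByArgumentIndex().length;
        if (expected != actual) {
            return "argument count differs: expected " + expected
                    + ", found " + actual;
        }
        return null;
    }

    public static void main(String[] args) {
        // Reordering arguments is fine; broken braces and dropped
        // arguments are reported.
        System.out.println(check("{0} ({1})", "{1}: {0}"));
        System.out.println(check("{0} ({1})", "{0}{{1}}"));
        System.out.println(check("{0} ({1})", "{0}"));
    }
}
```

Run over every translated string before a language pack is built, a check
like this would have rejected the translation from bug 250734. It does not
notice a translation that drops a lower-numbered argument while keeping the
highest one, so a real checker would need to compare the full set of
argument indexes.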

I'm curious: why do you reinvent basic i18n and l10n tools? Gettext-based
tools have been tested for decades in tens of thousands of programs. There
are tools for simple editing (kbabel, gtranslator, ...), for calling spell
checkers, and for collecting translations in large compendium files, so
that there is no need to retranslate "File" again and again (yes, the
license has to match). There are also web frontends such as Pootle (I
don't know it well).

Is it because everything has to be based on Java and accessible from
Eclipse?

I agree that you did a very good job. The web frontend looks great and
it simplifies collaboration. Well done! But have you considered
reusing existing projects or sharing your work with other translation
efforts?

I only once tried to translate an Eclipse plugin (Subclipse) and it was
hard. The file to be translated wasn't even a simple text file
(it used a funny encoding) and did not even contain the English strings
as a template.

> I envision a consistent, predictable release cycle similar to this:
>
> 2. We build/use a testing framework to ensure the language packs are  
> *functional*... If any 'bugs' are fixed, we rebuild, and this becomes RC2

As already mentioned, a translation syntax checker should probably be
sufficient. I do not remember any (non-Java) project I know of ever
profiting from i18n tests. Such tests of course do no harm :-)

> 3. We ask language champions (and the community at large) to review the  
> language packs ... If any bad translations need correcting, they are  
> fixed.    3.1 If the 'bad' translations are deemed stop-ship, we start 
> over from step 1, rebuild and call this RC3.  We try to avoid this, as 
> this will introduce new translations from the community until Step 0 is  
> implemented.

I know from the Ubuntu Linux distribution that the quality of web-based
translations (when no authorisation is required) can be very, very bad. So
I think that language champions are a very good idea.

Jens
