Re: [babel-dev] php string diffs for saving changes to document translations

Matt had a look at how MediaWiki does this.

MediaWiki stores the entire contents of the file on each save. This seems a bit excessive to me, but they must have a good reason for doing this. It does mean that getting the latest version is always easy. When you want to see the changes between two revisions, MediaWiki simply runs its custom diff engine on the two complete revisions. If the file is not a wiki markup file, it seems to shell out to diff.

Since there is no turnkey solution for doing this, here are a couple of solutions I can think of for Babel.

1. We use an external VCS as Antoine suggested.  I think this is an entirely valid approach, although perhaps a bit more complex.

2. We do as MediaWiki does -- save the entire file on each save. If it works for them, it will work for us. Except:
- when a user wants to diff two versions, we could use one of the diff libraries Gabe pointed out. No need to go to the shell or write our own.
- older revisions could pass through the gzip lib and be stored in compressed format. If a user wants to diff against an older version, we decompress it first, then pass it through the diff engine.
- one design consideration here would be to use many tables -- perhaps one translation table per language.  Or, one table to contain the 'latest' of everything and one table per language for the gzipped older versions.
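
A rough sketch of option 2, for illustration only. It is written in Python rather than PHP so that it runs with the standard library alone. `RevisionStore` and its method names are hypothetical, plain dicts stand in for the 'latest' and per-language archive tables, and `zlib`/`difflib` stand in for whichever compression and diff libraries Babel would actually use.

```python
import difflib
import zlib


class RevisionStore:
    """Hypothetical in-memory stand-in for the translation tables."""

    def __init__(self):
        self.latest = {}   # file_id -> latest text, stored uncompressed
        self.archive = {}  # file_id -> list of compressed older revisions

    def save(self, file_id, text):
        # Before overwriting, move the current latest revision into the
        # compressed archive so that no history is lost.
        if file_id in self.latest:
            old_bytes = self.latest[file_id].encode("utf-8")
            self.archive.setdefault(file_id, []).append(zlib.compress(old_bytes))
        self.latest[file_id] = text

    def get(self, file_id, rev=None):
        # rev=None means the latest revision; otherwise rev is an index
        # into the archive (0 is the oldest saved revision).
        if rev is None:
            return self.latest[file_id]
        return zlib.decompress(self.archive[file_id][rev]).decode("utf-8")

    def diff(self, file_id, rev_a, rev_b=None):
        # Decompress on demand, then diff the two complete revisions.
        a = self.get(file_id, rev_a).splitlines()
        b = self.get(file_id, rev_b).splitlines()
        to_name = "latest" if rev_b is None else "rev %d" % rev_b
        return list(difflib.unified_diff(
            a, b, fromfile="rev %d" % rev_a, tofile=to_name, lineterm=""))


store = RevisionStore()
store.save("intro.fr", "Bonjour\nmonde")
store.save("intro.fr", "Bonjour\nBabel")
changes = store.diff("intro.fr", 0)
```

Reading the latest revision never touches the archive, and the compression cost is paid only when someone asks for an older revision or a diff.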


Thoughts?




On 03/19/2010 10:55 AM, Antoine Toulme wrote:
A wiki like approach sounds good. Thanks for the explanations!

On Fri, Mar 19, 2010 at 07:38, Denis Roy <denis.roy@xxxxxxxxxxx> wrote:
I'll agree that when dealing with entire files, a VCS would be ideal.  However, interfacing with one from PHP may add much (unneeded) complexity.  Also, our simple LAMP application would then require the usage of an external VCS.

I fail to see how the usage of a PHP diff library is "reinventing" any more than using an external VCS.  We don't necessarily want to control versions here -- we just want to cut down on the storage size by saving deltas.

I may be missing something, though, or I may be misunderstanding how you would implement/interface with a VCS from PHP.  As for the editors, let's keep that as a separate subject.  The problem we want to solve here is storage.

The two solutions Gabe has enumerated look OK to me, but I'd have to try them out to be sure.  One thing that concerns me, however, is the need to replay the entire transaction set if we want to see the latest file.  This is perhaps where Antoine's VCS idea comes into play, since this is what it is designed to do.
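
To make the replay concern concrete, here is a small sketch assuming forward deltas: a base revision plus one delta per save. It is in Python for illustration only, not PHP. `make_delta`, `apply_delta` and `replay` are hypothetical names, and the delta format (a list of line-range replacements) is invented for this example rather than being what xdiff or Text_Diff would produce.

```python
import difflib


def make_delta(old_lines, new_lines):
    """Keep only the changed regions needed to turn old_lines into new_lines."""
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    delta = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Meaning: replace old_lines[i1:i2] with new_lines[j1:j2].
            delta.append((i1, i2, new_lines[j1:j2]))
    return delta


def apply_delta(old_lines, delta):
    result = list(old_lines)
    # Apply from the end of the file backwards so that the indices of
    # earlier regions are not shifted by later replacements.
    for i1, i2, replacement in reversed(delta):
        result[i1:i2] = replacement
    return result


def replay(base_lines, deltas):
    # Rebuilding the latest file means applying every delta in order,
    # so the cost grows with the number of saves.
    lines = base_lines
    for delta in deltas:
        lines = apply_delta(lines, delta)
    return lines


revisions = [
    ["Hello", "world", "bye"],
    ["Hello", "Babel", "bye", "now"],
    ["Babel", "bye", "now"],
]
deltas = [make_delta(a, b) for a, b in zip(revisions, revisions[1:])]
latest = replay(revisions[0], deltas)
```

Storing the deltas in the other direction (latest copy kept whole, deltas leading back to older revisions) would make the common case cheap and move the replay cost onto the rarely read old versions.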

But I'll say this again:  we should be looking at how MediaWiki is doing this, since we are doing the exact same thing.  I will enlist the help of my co-webmaster Matt, since he knows MediaWiki code quite well.

Denis




On 03/16/2010 08:23 PM, Antoine Toulme wrote:
Hi Gabe,

Wouldn't you be better off with a VCS for that?

I would recommend you also look at Bespin for an online editor that could help.

Another alternative could be to use an editor like the one GitHub uses so that people can make small changes online.

I'm afraid that otherwise you will end up reinventing quite a few things to reach the same result.

Now that's my 0.02 c, feel free to proceed with your plan,

Antoine

On Tue, Mar 16, 2010 at 16:22, Gabe O'Brien <gabe.obrien@xxxxxxxxxxx> wrote:
Hello Bablers,

  On our last status call Kit asked me to look into possible solutions to generating diffs for translations of documentation.  The idea is we could save the differences between translations as people work on a file and only save one working copy of the most recent translated document.  The diffs would serve as an audit trail and allow for recreating the translated document if needed.

  I found two possible solutions for creating diffs in PHP.  It would be great to get some feedback from Denis on which one would be better from a server standpoint.

Possible Solutions:
  http://php.net/manual/en/book.xdiff.php
  http://pear.php.net/package/Text_Diff

Gabe O'Brien
The Eclipse Foundation
_______________________________________________
babel-dev mailing list
babel-dev@xxxxxxxxxxx
https://dev.eclipse.org/mailman/listinfo/babel-dev
