Re: [jgit-dev] git server on appengine?

On Mon, Jun 13, 2011 at 00:43, pablo platt <pablo.platt@xxxxxxxxx> wrote:
> It seems that jgit recently added support for the appengine data store
> http://egit.eclipse.org/r/#change,2295

It's not the AppEngine data store. It's data stores that have similar
characteristics to the AppEngine data store. Actually, it's data stores
that have even fewer features than the AppEngine data store. :-)

But yes, if someone wrote the glue necessary to link the DHT's SPI
interfaces to the AppEngine data store, and wrote the schema
necessary, it would be possible to try and host JGit on AppEngine. I
can't promise it works out of the box right now. I know for a fact
that the "server side" code in JGit spawns a timer thread to handle
progress message timers for connected clients. As far as I know,
thread creation is still not supported on AppEngine from within Java
runtimes. So this section of code probably needs to be modified to use
a different timer management approach.
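
One thread-free way to do this, as a rough sketch only: check the clock from
the loop that is already doing the work, instead of having a timer thread
wake up. PolledProgressTimer is a hypothetical name, not a JGit class, and
the sketch assumes the work loop calls poll() often enough.

```java
/** Hypothetical helper, not part of JGit: a progress timer that is polled. */
class PolledProgressTimer {
    private final long intervalNanos;
    private long nextDue;

    PolledProgressTimer(long intervalMillis, long nowNanos) {
        this.intervalNanos = intervalMillis * 1_000_000L;
        this.nextDue = nowNanos + intervalNanos;
    }

    /**
     * Call from the work loop, passing System.nanoTime().
     * Returns true when a progress message is due, then re-arms itself.
     */
    boolean poll(long nowNanos) {
        if (nowNanos - nextDue < 0) {
            return false; // interval has not elapsed yet
        }
        nextDue = nowNanos + intervalNanos;
        return true;
    }
}
```

The trade-off is that nothing fires while the work loop itself is stalled,
which a real timer thread would not suffer from.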

The DHT code tries to use prefetching on background threads to improve
performance. This is configurable in the DHT SPI glue, so you can
certainly implement an AppEngine data store glue that runs everything
sequentially in the current thread, but the calling conventions will
be a bit weird because you have to remember to invoke the
AsyncCallback's onSuccess or onFailure method before returning from
the service routine. :-)
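
A minimal sketch of that calling convention, under stated assumptions: the
Callback interface below is a local stand-in for the SPI's AsyncCallback (the
real JGit signature may differ), and BlockingStore is a hypothetical in-memory
placeholder for a blocking AppEngine data store call.

```java
import java.util.HashMap;
import java.util.Map;

class SyncGlueSketch {
    /** Stand-in for the DHT SPI's AsyncCallback; an assumption, not the JGit type. */
    interface Callback<T> {
        void onSuccess(T result);
        void onFailure(Exception error);
    }

    /** Hypothetical blocking store; a real glue layer would call the data store here. */
    static class BlockingStore {
        private final Map<String, byte[]> rows = new HashMap<>();

        void put(String key, byte[] value) {
            rows.put(key, value);
        }

        byte[] get(String key) throws Exception {
            byte[] v = rows.get(key);
            if (v == null) {
                throw new Exception("missing row: " + key);
            }
            return v;
        }
    }

    /**
     * "Async" read that runs entirely in the current thread. Exactly one of
     * onSuccess or onFailure has been invoked by the time this returns.
     */
    static void get(BlockingStore store, String key, Callback<byte[]> cb) {
        byte[] result;
        try {
            result = store.get(key);
        } catch (Exception e) {
            cb.onFailure(e);
            return;
        }
        // Outside the try block, so an exception thrown by the callback
        // itself is not reported back as a second callback.
        cb.onSuccess(result);
    }
}
```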

> Is it possible to use jgit to host a private repository on appengine?

Sure, you could try to do this. Most of what you need is in the DHT
code you pointed to. I would love to see an AppEngine SPI provided to
the JGit project.

> Are there jgit/appengine limitations that prevents using it in release
> scenarios
> like time limit on requests to appengine

Probably. I thought I read somewhere that AppEngine lifted the time
limit on requests. What I don't know is how they handle the response
back to the client. If they store-and-forward the response, you will
run into response limit problems if the repository size is on the
large side, say 32 MiB. Most store-and-forward HTTP servers don't
expect payloads bigger than a few MiB. I hope they are streaming the
response these days.

Requests... I can tell you (but I cannot tell you why!) that a large
push will be a problem on AppEngine. Pushing 128 MiB (about 25% of the
linux-2.6 kernel repository) will probably not be successful, because
there are limits on the POST request size that AppEngine accepts.

> or maximum object size in the
> appengine datastore ?

Yes, this is a problem. IIRC the AppEngine limit is 1 MB per
object/row in the data store. This is also the JGit default for the
DhtInserterOptions. Because of really good delta compression on tree
objects, it's possible for a tree chunk to use about 150% of the chunk
size due to the SHA-1 index that is stored alongside of the tree data.
You have two options here when mapping onto AppEngine. The first is to
simply decrease the chunk size to 512 KiB (rather than the default of
1 MiB), making it more likely the row will fit within the 1 MiB limit.
The other option is to use two rows, and store the ChunkIndex data
in a different row from the ChunkData itself. The DhtInserterOptions
should also have a setting to control the number of objects in the
chunk; that limit controls the size of the ChunkIndex, as the
ChunkIndex is worst case 1026 header bytes followed by 24 bytes per
object.
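
The worst-case figures above can be turned into a quick budget check. This
is a sketch only: ChunkIndexBudget is a hypothetical helper, not a JGit class.
It assumes the row limit is exactly 1 MiB and ignores any per-row overhead
(keys, property names) that the data store may add.

```java
/** Hypothetical helper, not part of JGit: worst-case ChunkIndex sizing. */
class ChunkIndexBudget {
    static final int HEADER_BYTES = 1026;    // worst-case header, per the text
    static final int BYTES_PER_OBJECT = 24;  // per-object index entry

    /** Worst-case ChunkIndex size for a chunk holding objectCount objects. */
    static long worstCaseIndexBytes(int objectCount) {
        return HEADER_BYTES + (long) BYTES_PER_OBJECT * objectCount;
    }

    /** Largest object count whose index still fits beside the chunk data. */
    static int maxObjects(long rowLimitBytes, long chunkDataBytes) {
        long room = rowLimitBytes - chunkDataBytes - HEADER_BYTES;
        if (room < 0) {
            return 0;
        }
        return (int) Math.min(Integer.MAX_VALUE, room / BYTES_PER_OBJECT);
    }
}
```

With a 512 KiB chunk in a 1 MiB row, maxObjects(1048576, 524288) gives
21802 objects, so a per-chunk object limit at or below that keeps the
ChunkIndex and ChunkData within one row under these assumptions.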

I would probably try to keep the ChunkIndex and ChunkData in the same
object/row. These are needed together. The AppEngine data store takes
~20 ms to load one object. Storing them and loading them together is
probably faster than having a bigger ChunkData and waiting for two
objects to load.

-- 
Shawn.