[egit-dev] Re: [JGit-io-RFC-PATCH v2 2/4] Add JGit IO SPI and default implementation

From: "Shawn O. Pearce" <spearce@xxxxxxxxxxx>
Date: Mon, 12 Oct 2009 07:57:42 -0700
Delivered-to: egit-dev@xxxxxxxxxxx
User-agent: Mutt/1.5.17+20080114 (2008-01-14)

imyousuf@xxxxxxxxx wrote:
> The SPI mainly focus's in providing an API to JGit to be able to perform
> similar operations to that of java.io.File. All direct I/O is based on the
> java.io.Input/OutputStream classes.
> 
> Different JGit IO SPI provider is designed to be URI scheme based and thus
> the default implementation is that of "file" scheme. SPI provider will be
> integrated by their respective users in a manner similar to that of JDBC
> driver registration. There is a SystemStorageManager that has similar
> registration capabilities and the system storage providers should be
> registered with the manager in one of the provided ways.

I think this may be a bit in the wrong direction for what we are
trying to accomplish.

A number of people really want to map Git onto what is essentially
Google's BigTable schema.  Aside from Google's own BigTable product
(which I want to use Git on at work, because it would vastly simplify
my system administration duties at $DAYJOB) there is Cassandra and
Hadoop HBase which implement the same schema semantics.

None of those systems implement file streams; they implement cell
storage in a non-transactional system with a semi-dynamic schema.

Some people have built transactional semantics on top of these
storage layers, e.g. Google AppEngine provides multiple row
transactions through some magic sauce layered on top of BigTable.
I'm sure people will build similar tools on top of Cassandra
and HBase.

Where I'm trying to go with this is that things that are stored
in files on the filesystem in traditional Git wouldn't normally be
mapped into "byte streams" in a BigTable-ish system, or even the
JDBC-ish system you were describing.

For .git/config we might want to map config variable names into
keys in the table, with values stored in cells.  This makes it
easier to query or edit the data.

Fortunately, "Config" is abstract enough that we could subclass
it with a CassandraConfig and simply use that instance when on a
Cassandra-based storage system.  No file streams required.  Ditto
for a JdbcConfig.
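
As a rough illustration of the config mapping described above, here is a
minimal sketch in which each config variable becomes one key and its value
one cell.  Everything in it is hypothetical: CellConfigSketch is not JGit's
Config class, the sorted in-memory map merely stands in for a BigTable-style
table, and the "section.subsection.name" key layout is an assumption.

```java
import java.util.Locale;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical sketch: not JGit's Config API and not a real Cassandra client.
final class CellConfigSketch {
    // Stand-in for a table with sorted row keys: variable key -> cell value.
    private final SortedMap<String, String> cells = new TreeMap<>();

    // Build a key such as "core.bare" or "remote.origin.url".
    // Section and variable names are lower-cased; the subsection is kept as-is.
    static String key(String section, String subsection, String name) {
        String s = section.toLowerCase(Locale.ROOT);
        String n = name.toLowerCase(Locale.ROOT);
        return subsection == null ? s + "." + n : s + "." + subsection + "." + n;
    }

    void setString(String section, String subsection, String name, String value) {
        cells.put(key(section, subsection, name), value);
    }

    // Returns null when the variable has no cell.
    String getString(String section, String subsection, String name) {
        return cells.get(key(section, subsection, name));
    }

    // Range scan over the sorted keys: every variable under one section.
    SortedMap<String, String> section(String section) {
        String prefix = section.toLowerCase(Locale.ROOT) + ".";
        return cells.subMap(prefix, prefix + Character.MAX_VALUE);
    }

    public static void main(String[] args) {
        CellConfigSketch cfg = new CellConfigSketch();
        cfg.setString("core", null, "bare", "true");
        cfg.setString("remote", "origin", "url", "git://example.com/repo.git");
        System.out.println(cfg.getString("core", null, "bare"));
        System.out.println(cfg.section("remote"));
    }
}
```

With this layout, querying or editing one variable touches a single cell,
and listing a section is a key-range scan, with no file to parse or rewrite.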

For RefDatabase, we'd want to do the same and avoid the concept of
packed-refs altogether.  Each Ref should go into its own row in a
Cassandra storage system, and essentially act as a loose object.
Ditto with JDBC.
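
A minimal sketch of the row-per-ref idea follows.  It is hypothetical
throughout: RowPerRefSketch is not JGit's RefDatabase, a concurrent
in-memory map stands in for the table, and the per-row conditional update
is an assumption about what the backend could offer, not something any of
the systems named above is claimed to provide.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch: not JGit's RefDatabase and not a real Cassandra or JDBC client.
final class RowPerRefSketch {
    // One row per ref: ref name -> object id as a hex string.
    private final ConcurrentMap<String, String> rows = new ConcurrentHashMap<>();

    // Returns the stored object id, or null if the ref has no row.
    String read(String refName) {
        return rows.get(refName);
    }

    // Update the row only if it still holds expectedOld.
    // A null expectedOld means the ref must not exist yet.
    boolean compareAndSet(String refName, String expectedOld, String newId) {
        if (expectedOld == null) {
            return rows.putIfAbsent(refName, newId) == null;
        }
        return rows.replace(refName, expectedOld, newId);
    }

    // Delete the row only if it still holds expectedOld.
    boolean delete(String refName, String expectedOld) {
        return rows.remove(refName, expectedOld);
    }

    public static void main(String[] args) {
        RowPerRefSketch refs = new RowPerRefSketch();
        String id = "1111111111111111111111111111111111111111";
        System.out.println(refs.compareAndSet("refs/heads/master", null, id));
        System.out.println(refs.read("refs/heads/master"));
    }
}
```

Because every ref lives in its own row, an update touches exactly one row
and there is no packed-refs file to rewrite.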

We'd probably never need to read or write the info/refs or
objects/info/packs listings.

And I think that's everything that a bare repository needs, aside
from ObjectDatabase, which is already mostly abstract anyway.

-- 
Shawn.
