Skip to main content

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [List Home]
[jgit-dev] Re: WorkingTreeIterator - the writing side

Christian Halstrick <christian.halstrick@xxxxxxxxx> wrote:
> 
> by having different implementations of WorkingTreeIterator we allow
> jgit-users to define how content in the working tree is accessed. This
> works great for the reading side - but the writing side is not covered
> yet.

No, that's a huge gaping hole.

> How do we allow jgit-users to specify how working tree content
> should be written. When jgit is used on server-side (gerrit) java file
> I/O may not be optimal.

Its not even necessarily optimal in EGit.  If you do changes through
the IResource APIs, the workbench can more easily let builders and
other plugins know that resources were changed, so they can react
more quickly to the update.

> What do you think: is WorkingTreeIterator also a good place to have
> writing abstractions (e.g. by adding a openEntryOutputStream()
> method)?

I'm starting to think so too.

> Don't we get into trouble when in the middle of a walk
> somebody starts writing new content? We have to support deletion and
> creation of files also.

We shouldn't.  The WorkingTreeIterator caches the entire directory
contents before the TreeWalk is able to acquire an entry.  Therefore
changes won't really be reflected.  Though that caching is limited,
obviously we don't compute and cache the ObjectIds for every file.
We might not even cache their lengths or timestamps up front.
But we do promise to cache the list of name strings.


Just thinking out loud...

Our first problem is what happens when there is no working tree
item for the current path of a TreeWalk?  In this case you cannot
get the WorkingTreeIterator.  And if you did, its current entry is
the name past the current entry of the TreeWalk.  So how do we get
a handle on a WorkingTreeIterator that can create a new entry for us?

And if we do get a handle on it, things get really hairy for new
directory creation.  Assuming its a new directory, we'll want to
dive into that new directory via TreeWalk.enterSubtree() after
its been created.  To do that the WorkingTreeIterator will need to
insert the new entry in its cached entries member.

Lets assume we've solved those issues magically.  Then we probably
want an API like this:

  class WorkingTreeIterator {
    /**
     * Replace the current entry with a blob contents.
     */
    void writeEntryBlob(FileMode newMode, ObjectLoader ldr) throws IOException;

    /**
     * Replace the current entry with a symbolic link.
     */
    void writeEntrySymlink(String target) throws IOException;

    /**
     * Replace the current entry with an empty directory.
     *
     * If the current entry is a file or symbolic link, the file or link
     * is first deleted before being replaced with an empty directory.
     */
    void makeEntryDirectory() throws IOException;

    /**
     * Delete the entry.
     *
     * If the entry is a directory, it must be empty.  If it still
     * has contents, an exception is thrown.
     */
    void deleteEntry() throws IOException;
  }

So how do we get the WorkingTreeIterator to position itself onto
an entry that doesn't yet exist?

Perhaps we modify TreeWalk a tad:

  class TreeWalk {

    public <W extends WorkingTreeIterator> W getOrCreateTree(
                int nth, Class<W> clazz, FileMode mode) {
      AbstractTreeIterator t = trees[nth];
      if (t.matches == currentHead)
        return t;

      WorkingTreeIterator w = (WorkingTreeIterator)t;
      w.insertNewEntry(getNameString(), mode);
      w.matches == currentHead;
      return (W)w;
    }
  }

and implement some sort of insertNewEntry(String, FileMode) API on
WorkingTreeIterator to inject the missing entry.


This isn't fully generic.  One of the things I don't like
is writeEntryBlob is defined in terms of an ObjectLoader.
That is intentional because the implementation can invoke
ObjectLoader.copyTo(OutputStream) and possibly avoid some unnecessary
copying during output.  However its awkward to use for your merge
code for example, its a tad difficult to write out a conflicted
file contents.

If you have the contents as a byte[], you can hand it off to
ObjectLoader.SmallObject and let that be the argument.  Or,
implement ObjectLoader yourself and use ObjectStream.Filter and
pass in a normal InputStream.

We could implement some of this as overloads:

  public final void writeBlobEntry(FileMode mode, byte[] content) {
    writeBlobEntry(mode, new ObjectLoader.SmallObject(OBJ_BLOB, content));
  }

  public final void writeBlobEntry(FileMode mode,
      final byte[] content, final int off, final int cnt) {
    writeBlobEntry(mode, new ObjectLoader() {
      public int getType() {
        return OBJ_BLOB;
      }

      public long getSize() {
        return cnt;
      }
 
      public byte[] getCachedBytes() {
        throw new LargeObjectException();
      }

      public void copyTo(OutputStream out) throws IOException {
        out.write(content, off, cnt);
      }

      public ObjectStream openStream() {
        return new ObjectStream.Filter(OBJ_BLOB, cnt,
          new ByteArrayInputStream(content, off, cnt));
      }
    });
  }

-- 
Shawn.


Back to the top