When fetching artifacts from a source repository to store in the local agent artifact repository cache or the eclipse touchpoint bundle pool, the table of contents file (artifacts.xml) is read and written for each copied artifact. With ~4000 artifacts, this process takes a long time (~16 hours in my last experiment, with the source repository on the same drive as the cache). Anecdotally, a very big part of the time was spent reading/writing the artifacts.xml files. Pausing in the debugger showed code in or downstream from the Thoughtworks XStream every time I paused (a relatively small sample); many of the breaks were in reflection. We need to investigate:

a) Reducing reads/writes of the TOC files (always write, but read only if inconsistent with the last write).

b) Whether there are simple things we can do with XStream to cache its parser/emitter state so it is not recomputed via reflection on every use.
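The "read only if inconsistent with the last write" idea in (a) could be sketched roughly as below. This is a hypothetical illustration, not p2 code: the class and method names are invented, and it keys consistency off the file's modification time recorded at our last write, falling back to a real read (and, in the real code, an XStream parse) only when the file has changed underneath us.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical cache of TOC contents keyed by path. After a write we
// remember the contents and the resulting mtime; a read returns the
// cached copy (skipping the expensive parse) unless the on-disk file
// has been modified since our last write.
final class TocCache {
    private record Entry(FileTime mtime, String contents) {}

    private final ConcurrentMap<Path, Entry> lastWritten = new ConcurrentHashMap<>();

    void write(Path toc, String contents) throws IOException {
        Files.writeString(toc, contents);
        lastWritten.put(toc, new Entry(Files.getLastModifiedTime(toc), contents));
    }

    String read(Path toc) throws IOException {
        Entry e = lastWritten.get(toc);
        if (e != null && e.mtime().equals(Files.getLastModifiedTime(toc))) {
            return e.contents(); // consistent with our last write: no re-read/parse
        }
        // Inconsistent (or never written by us): do the real read.
        String onDisk = Files.readString(toc);
        lastWritten.put(toc, new Entry(Files.getLastModifiedTime(toc), onDisk));
        return onDisk;
    }
}
```

Mtime checking is cheap but not bulletproof (coarse timestamp granularity); a content digest would be safer at slightly more cost per write.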
Added performance keyword.
Note that a less verbose file format would dramatically reduce the size of the file - it's just a list of four-tuples (namespace,id,qualifier,version), so it would be very easy to cook up a serialization that doesn't use XStream. A more radical possibility: I wrote an artifact repository implementation that didn't even have a TOC file. There's nothing in the artifact repo API that forces it to have such a table of contents, as long as it can determine whether it has a given artifact in a reasonable amount of time.
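For concreteness, such an XStream-free serialization could be as simple as one comma-separated four-tuple per line. This is only a sketch (the class name and exact format are invented here, and a real version would need to escape commas in field values):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical line-oriented TOC format: one
// namespace,id,qualifier,version tuple per line.
// No XML, no reflection-based (de)serialization.
final class CompactToc {

    static String write(List<String[]> tuples) {
        StringBuilder sb = new StringBuilder();
        for (String[] t : tuples) {
            sb.append(String.join(",", t)).append('\n');
        }
        return sb.toString();
    }

    static List<String[]> read(String serialized) {
        List<String[]> tuples = new ArrayList<>();
        for (String line : serialized.split("\n")) {
            if (!line.isEmpty()) {
                tuples.add(line.split(",", 4));
            }
        }
        return tuples;
    }
}
```

Parsing this is a straight string split per line, so the per-artifact cost is a few allocations rather than a reflective XML round trip.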
This is really an issue of the save lifecycle on the repo. We should look at this for M3.
*** Bug 213616 has been marked as a duplicate of this bug. ***
*** This bug has been marked as a duplicate of bug 244628 ***