Bug 184538 - [prov] [repo] Copying an artifact from source repo to agent cache repo reads and writes toc for each artifact
Status: RESOLVED DUPLICATE of bug 244628
Alias: None
Product: Equinox
Classification: Eclipse Project
Component: p2
Version: 3.4
Hardware: PC Windows XP
Importance: P3 major
Target Milestone: ---
Assignee: P2 Inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords: performance
Duplicates: 213616
Depends on:
Blocks: 249167
Reported: 2007-04-27 19:42 EDT by Dave Stevenson CLA
Modified: 2009-08-25 22:49 EDT (History)
CC List: 4 users

See Also:


Description Dave Stevenson CLA 2007-04-27 19:42:43 EDT
When fetching artifacts from a source repository to store in the local agent artifact repository cache or the Eclipse touchpoint bundle pool, the table of contents files (artifacts.xml) are read and written for each copied artifact. With ~4000 artifacts, this process takes a long time (~16 hours in my last experiment, with the source repository on the same drive as the cache).

Anecdotally, a very large part of the time was spent reading and writing the artifacts.xml files. Every time I paused in the debugger (a relatively small sample), execution was in or downstream from the ThoughtWorks XStream code. Many of the breaks were in reflection.

We need to investigate:

   a). Reducing reads/writes of the TOC files (at least always write, but read only if inconsistent with last write).
   b). Whether there are simple things to do with XStream to cache its parser/emitter state so it is not recomputed by reflection with every use.
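
To make (a) concrete, here is a minimal sketch of one way to cut TOC writes: keep the entries in memory and write the file once per batch rather than once per artifact. This is not the p2 API. `BatchedToc`, its methods, and the `writer` callback are hypothetical names made up for illustration, and the callback stands in for whatever actually serializes artifacts.xml.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical illustration only; none of these names come from p2.
class BatchedToc {
    private final Set<String> keys = new LinkedHashSet<>();
    private final Consumer<Collection<String>> writer; // persists the whole TOC
    private boolean dirty; // true when memory holds entries not yet written

    BatchedToc(Consumer<Collection<String>> writer) {
        this.writer = writer;
    }

    // Record an artifact in memory only; no file access happens here.
    void add(String artifactKey) {
        if (keys.add(artifactKey)) {
            dirty = true;
        }
    }

    // Answer lookups from memory instead of re-reading the TOC file.
    boolean contains(String artifactKey) {
        return keys.contains(artifactKey);
    }

    // Write the TOC once, and only if something changed since the last save.
    void save() {
        if (!dirty) {
            return;
        }
        writer.accept(Collections.unmodifiableSet(keys));
        dirty = false;
    }
}
```

With this shape, copying N artifacts costs N calls to `add` and a single call to `save`, so the serializer runs once instead of N times. The trade-off is that entries added since the last `save` are lost if the process dies, which is why the save lifecycle matters.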
Comment 1 Dave Stevenson CLA 2007-04-27 19:43:39 EDT
Added performance keyword.
Comment 2 John Arthorne CLA 2007-04-30 10:01:06 EDT
Note that a less verbose file format would dramatically reduce the size of the file. It's just a list of four-tuples (namespace, id, qualifier, version), so it would be very easy to cook up a serialization that doesn't use XStream.
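
As a rough sketch of what such a serialization might look like, the code below writes one tuple per line with comma-separated fields. The format and the `TupleToc` class are assumptions made up for illustration, not anything p2 defines, and the sketch assumes no field contains a comma or a newline.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical line-based TOC format: namespace,id,qualifier,version
class TupleToc {
    // Serialize each four-tuple as one comma-separated line.
    static String write(List<String[]> tuples) {
        StringBuilder sb = new StringBuilder();
        for (String[] t : tuples) {
            if (t.length != 4) {
                throw new IllegalArgumentException("expected 4 fields");
            }
            sb.append(String.join(",", t)).append('\n');
        }
        return sb.toString();
    }

    // Parse the text back into four-tuples, skipping empty lines.
    static List<String[]> read(String text) {
        List<String[]> tuples = new ArrayList<>();
        for (String line : text.split("\n")) {
            if (line.isEmpty()) {
                continue;
            }
            // Limit -1 keeps empty fields, e.g. an empty qualifier.
            String[] t = line.split(",", -1);
            if (t.length != 4) {
                throw new IllegalArgumentException("malformed line: " + line);
            }
            tuples.add(t);
        }
        return tuples;
    }
}
```

No reflection is involved, and each entry costs a few dozen bytes rather than a nested XML element.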

A more radical possibility: I wrote an artifact repository implementation that didn't even have a TOC file. There's nothing in the artifact repo API that forces it to have such a table of contents, as long as it can determine whether it has a given artifact in a reasonable amount of time.
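
The implementation mentioned above is not shown in this bug, so the following is only a guess at how a TOC-less repository could answer "do I have this artifact?". It assumes every artifact lives at a location derived from its key, so a lookup is one existence check against the store. `TocLessRepo`, `pathFor`, and the path layout are all hypothetical.

```java
import java.util.function.Predicate;

// Hypothetical sketch; the path layout below is an assumption for illustration.
class TocLessRepo {
    private final Predicate<String> pathExists; // asks the backing store about a relative path

    TocLessRepo(Predicate<String> pathExists) {
        this.pathExists = pathExists;
    }

    // Derive a deterministic relative location from the artifact key.
    static String pathFor(String namespace, String id, String version) {
        return namespace + "/" + id + "_" + version + ".jar";
    }

    // No table of contents: presence is decided by the store itself.
    boolean contains(String namespace, String id, String version) {
        return pathExists.test(pathFor(namespace, id, version));
    }
}
```

On a local file system the predicate could be something like `p -> Files.exists(root.resolve(p))` for some repository root, so adding an artifact never touches a shared index file.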
Comment 3 Jeff McAffer CLA 2007-09-17 16:45:58 EDT
This is really an issue of the save lifecycle on the repo. We should look at this for M3.
Comment 4 Simon Kaegi CLA 2007-12-21 10:19:16 EST
*** Bug 213616 has been marked as a duplicate of this bug. ***
Comment 5 Pascal Rapicault CLA 2009-08-25 22:49:21 EDT

*** This bug has been marked as a duplicate of bug 244628 ***