Bug 301903 - Consider a file level lock during write operations on the artifact repository
Status: RESOLVED FIXED
Alias: None
Product: Equinox
Classification: Eclipse Project
Component: p2
Version: unspecified
Hardware: PC Linux
Importance: P3 normal
Target Milestone: 3.7 M6
Assignee: Ian Bull CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-02-04 19:01 EST by Ian Bull CLA
Modified: 2011-03-05 01:32 EST
CC List: 2 users

See Also:


Attachments
Patch v1 (15.85 KB, patch), 2011-02-14 23:33 EST, Ian Bull CLA
mylyn/context/zip (24.49 KB, application/octet-stream), 2011-02-14 23:33 EST, Ian Bull CLA
Patch v2 (22.29 KB, patch), 2011-02-18 16:19 EST, Ian Bull CLA
mylyn/context/zip (690.96 KB, application/octet-stream), 2011-02-18 16:19 EST, Ian Bull CLA
Patch v3 (110.78 KB, patch), 2011-03-03 01:31 EST, Ian Bull CLA
mylyn/context/zip (977.05 KB, application/octet-stream), 2011-03-03 01:31 EST, Ian Bull CLA
Patch v4 (82.21 KB, patch), 2011-03-03 17:29 EST, Ian Bull CLA
mylyn/context/zip (221.59 KB, application/octet-stream), 2011-03-03 17:29 EST, Ian Bull CLA
Patch v5 (99.88 KB, patch), 2011-03-04 16:40 EST, Ian Bull CLA
mylyn/context/zip (831.00 KB, application/octet-stream), 2011-03-04 16:40 EST, Ian Bull CLA
Patch v6 (95.99 KB, patch), 2011-03-04 22:16 EST, Ian Bull CLA

Description Ian Bull CLA 2010-02-04 19:01:20 EST
In Bug 274025 we added batch execution. We might want to consider a file level lock during the write.
Comment 1 Ian Bull CLA 2011-02-14 23:33:37 EST
Created attachment 188971 [details]
Patch v1

This is an outline of the solution.  It locks and loads before each repository modification method; after the modification, the lock is released.
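
A minimal sketch of that pattern (not the patch itself; the patch may use p2's own locking infrastructure, and the ".artifactlock" name is taken from later comments and simplified to a single file here -- plain java.nio file locking is used only for illustration):

import java.io.File;
import java.io.RandomAccessFile;
import java.nio.channels.FileLock;

public class LockAndLoadSketch {
    private final File lockFile;

    public LockAndLoadSketch(File repoLocation) {
        this.lockFile = new File(repoLocation, ".artifactlock");
    }

    public void modify(Runnable modification) throws Exception {
        try (RandomAccessFile raf = new RandomAccessFile(lockFile, "rw");
                FileLock lock = raf.getChannel().lock()) {
            reloadIfStale();      // "load": refresh the in-memory state first
            modification.run();   // apply the change
            save();               // persist while still holding the lock
        }                         // lock released here
    }

    private void reloadIfStale() { /* re-parse artifacts.xml if it changed on disk */ }
    private void save() { /* write artifacts.xml */ }
}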
Comment 2 Ian Bull CLA 2011-02-14 23:33:39 EST
Created attachment 188972 [details]
mylyn/context/zip
Comment 3 Ian Bull CLA 2011-02-18 16:19:42 EST
Created attachment 189325 [details]
Patch v2

Here is an updated patch that works much better.  This implements Lock&Load on write, and releases the lock afterwards. I also implemented a timestamp check for the load (only load if it has changed).  There are two outstanding problems:

1. We don't load on all read operations. For example, in getProperty we may have to load from disk, as the property might have changed. I've done this for getDescriptors, so we just have to do the same thing for the other operations.

2. The 'set' operations (add descriptor, set property, etc.) don't take progress monitors. However, these might be very long-running operations (if the repo is locked by another process).  We should consider adding new methods that take progress monitors and delegating the old methods to them using a NullProgressMonitor (for backwards compatibility).

A few other issues:
1. I need more test cases.
2. The timestamp check is accurate to within a second on Linux.  If two processes change the same file within a second, one process may not notice the change.  This is because getLastModified is only good to within a second.  How is this usually handled?  Should we also use the file size?
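
For the staleness check, something along these lines (names are illustrative; combining lastModified with the length, as suggested above, narrows the one-second window but does not close it):

import java.io.File;

class RepositoryState {
    private long cachedTimestamp = -1L;
    private long cachedLength = -1L;

    // Stale if either the modification time or the size changed since we last loaded.
    boolean isStale(File artifactsXml) {
        return artifactsXml.lastModified() != cachedTimestamp
                || artifactsXml.length() != cachedLength;
    }

    void markLoaded(File artifactsXml) {
        cachedTimestamp = artifactsXml.lastModified();
        cachedLength = artifactsXml.length();
    }
}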
Comment 4 Ian Bull CLA 2011-02-18 16:19:53 EST
Created attachment 189326 [details]
mylyn/context/zip
Comment 5 Ian Bull CLA 2011-03-03 01:31:20 EST
Created attachment 190238 [details]
Patch v3

This essentially completes the work. Here is what I've done:

1. Added new methods to IArtifactRepository that take progress monitors for anything that might block. I deprecated the old methods and added a note pointing to the new ones. In our abstract base class, I delegated all the old methods to the new ones (with a NullProgressMonitor); see the sketch at the end of this comment. Since we have @noimplement on our interface, this should not break anyone.

2. Whenever you call a method that will 'write' to the repository, we 'lock and load'. 

3. Whenever you call a method that will 'read' the repository, we check if anyone has the repository locked, and if not we load it. If they do, we block (if we can, i.e. we have a progress monitor).

If we are not in a position to block, then we don't and we simply use whatever cached version we have.  I've given this a lot of thought and I don't think this is too bad. The point of this bug report was really to stop ourselves from overwriting artifact repositories with multiple processes.  The operations won't fail, but you may get back your cached version if someone else is currently writing.  If you really want multiple readers and multiple writers at the same time (and you expect this to work with fine-grained timing), then you should wrap your operations in an execute batch. This will grab a lock and hold it until you are done.  

I might change this behaviour slightly, and block for a fixed (short) period of time, but I don't think we should block indefinitely.

4. I added some test cases that demonstrate a few different things.

Pascal, can you take a look at this from an API view?  Are you happy with this approach?

Some things I still want to do:
1. Add more test cases
2. Right now we use file timestamps to see if we are up-to-date. This is accurate to 1 second. I would like to improve this. 

Neither of these requires API changes, and this approach (IMHO) is much better than what we currently have.
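
The delegation in point 1 looks roughly like this (removeDescriptor shown as one example; the full method set is in the patch):

import org.eclipse.core.runtime.IProgressMonitor;
import org.eclipse.core.runtime.NullProgressMonitor;
import org.eclipse.equinox.p2.metadata.IArtifactKey;

public abstract class AbstractArtifactRepositorySketch {
    /** @deprecated use removeDescriptor(IArtifactKey, IProgressMonitor) */
    @Deprecated
    public void removeDescriptor(IArtifactKey key) {
        removeDescriptor(key, new NullProgressMonitor());
    }

    // New variant: the monitor lets callers report progress and cancel
    // while we wait for another process to release the repository lock.
    public abstract void removeDescriptor(IArtifactKey key, IProgressMonitor monitor);
}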
Comment 6 Ian Bull CLA 2011-03-03 01:31:36 EST
Created attachment 190239 [details]
mylyn/context/zip
Comment 7 Jeff McAffer CLA 2011-03-03 17:20:50 EST
Ian and I chatted about the current approach.  Some notes
- blocking on read is likely unnecessary as long as the data returned is coherent.  For most of the reading methods they can simply check that they are up to date and return what they have.  If the repo is unlocked the memory copy can be updated.  If it is locked, just use what is in memory since the repo is in the middle of being modified.  This keeps the read behaviour simple to explain, understand and use.  Also means we don't need the progress monitors in so many places which makes the change and use simpler.

- executeBatch() synchronizes which means that only one thread can execute at a time when really what we want is deferred save.  We could remove the sync and simply count batch enter/exit and save when the defer count is 0 (whatever).
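
A sketch of that defer-count idea (illustrative; the short synchronized blocks here only guard the counter, not the whole batch body, so concurrent batches can still run):

class DeferredSave {
    private int deferCount;
    private boolean dirty;

    synchronized void enterBatch() {
        deferCount++;
    }

    synchronized void exitBatch() {
        if (--deferCount == 0 && dirty) {
            save();
            dirty = false;
        }
    }

    synchronized void markDirty() {
        if (deferCount > 0)
            dirty = true;   // a batch is open: defer the save
        else
            save();         // no batch in progress: save immediately
    }

    void save() { /* write artifacts.xml */ }
}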

- Must look at the file locking mechanism used.  If it is thread-relative then we are essentially synchronizing within the VM and between VMs.  Really we want to synchronize on a per-agent basis.  Each agent should have their own in memory copy of the repo and the SimpleArtifactRepo implementation keeps that data structure coherent.  Then each in-memory copy needs to be kept in sync with the disk.

- fixing the artifact repo is great and will help bundle pooling but it will not help publishing (and perhaps not mirroring and friends).  Publisher batches metadata changes and writes them all at the end but AFAIK it does not take care to make sure that the repo is up to date so we can get lost updates.  It should be relatively easy to do the lock and load approach on metadata repos as well.

- The timestamp granularity issue likely is real. We must have addressed this other places.  What did we do?  Scenarios I know of today have multiple (e.g., 10) machines publishing at the same time so they will easily have multiple changes in the same timestamp granularity.  Today the whole thing is synchronized but that is less than optimal for sure.
Comment 8 Ian Bull CLA 2011-03-03 17:29:03 EST
Created attachment 190323 [details]
Patch v4

This patch removes the 'load on read' -- to be specific, if someone has the location locked, then we don't load on read.  If the location is not locked, and we're not up-to-date, then we load.
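
In sketch form (java.nio locking is used here only for illustration; the patch may use a different mechanism):

import java.io.File;
import java.io.RandomAccessFile;
import java.nio.channels.FileLock;

class ReadPathSketch {
    // Returns true if we reloaded (or were already current), false if the
    // repo was locked by someone else and we fell back to the cached copy.
    boolean reloadIfUnlockedAndStale(File lockFile) throws Exception {
        try (RandomAccessFile raf = new RandomAccessFile(lockFile, "rw")) {
            FileLock lock = raf.getChannel().tryLock();
            if (lock == null)
                return false;      // locked: use whatever is in memory
            try {
                if (isStale())
                    load();        // unlocked and out of date: reload now
                return true;
            } finally {
                lock.release();
            }
        }
    }

    boolean isStale() { return false; /* timestamp/length check */ }
    void load() { /* re-parse artifacts.xml */ }
}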
Comment 9 Ian Bull CLA 2011-03-03 17:29:07 EST
Created attachment 190324 [details]
mylyn/context/zip
Comment 10 Ian Bull CLA 2011-03-03 17:37:48 EST
(In reply to comment #7)
Thanks for the feedback Jeff.

> Ian and I chatted about the current approach.  Some notes
> - blocking on read is likely unnecessary as long as the data returned is
> coherent.  For most of the reading methods they can simply check that they are
> up to date and return what they have.  If the repo is unlocked the memory copy
> can be updated.  If it is locked, just use what is in memory since the repo is
> in the middle of being modified.  This keeps the read behaviour simple to
> explain, understand and use.  Also means we don't need the progress monitors in
> so many places which makes the change and use simpler.

Fixed in the latest patch.

> 
> - executeBatch() synchronizes which means that only one thread can execute at a
> time when really what we want is deferred save.  We could remove the sync and
> simply count batch enter/exit and save when the defer count is 0 (whatever).

I know we talked about this today, and I mentioned the alternative of adding a lock that you must acquire at the beginning of execute batch (or synchronize on another lock).  This means that other threads can call methods (add artifact descriptor, etc.) while someone is running an execute batch, but only 1 thread can call execute batch at one time. I think this is interesting because you can now use execute batch to synchronize between multiple processes (using the file level lock we acquire), and between threads in a single process (using this lock).  Jeff, you mentioned you were against this, but I haven't wrapped my head around why this is a bad idea.

> 
> - Must look at the file locking mechanism used.  If it is thread-relative then
> we are essentially synchronizing within the VM and between VMs.  Really we want
> to synchronize on a per-agent basis.  Each agent should have their own in memory
> copy of the repo and the SimpleArtifactRepo implementation keeps that data
> structure coherent.  Then each in-memory copy needs to be kept in sync with the
> disk.
Yes, there really should only be 1 instance of any SimpleArtifactRepository in memory.  

> 
> - fixing the artifact repo is great and will help bundle pooling but it will not
> help publishing (and perhaps not mirroring and friends).  Publisher batches
> metadata changes and writes them all at the end but AFAIK it does not take care
> to make sure that the repo is up to date so we can get lost updates.  It should
> be relatively easy to do the lock and load approach on metadata repos as well
Yes, I agree.  

> 
> - The timestamp granularity issue likely is real. We must have addressed this
> other places.  What did we do?  Scenarios I know of today have multiple (e.g.,
> 10) machines publishing at the same time so they will easily have multiple
> changes in the same timestamp granularity.  Today the whole thing is
> synchronized but that is less than optimal for sure.
Tom suggested we use a timestamp in a file (we can put that at the lock location, which is ${artifact_repo_location}/.artifactlock).  I used this location because we have an artifact repository at the root of an eclipse install, and I didn't want to claim this 'OSGi Location'... I assume this place is sacred.
Comment 11 Ian Bull CLA 2011-03-03 23:54:10 EST
Pascal asked me about the performance impact of this change. This is a little hard to measure since we don't have any performance tests.  I used a simple test that added X number of artifact descriptors to an artifact repo.  I set X to 1,000 and 10,000.

1,000 Artifact Descriptors Added
================================
Before Lock:
Total time: 2301
Total time: 26   <-- executeBatch

After Lock: 
Total time: 2623
Total time: 35   <-- executeBatch

10,000 Artifact Descriptors Added
=================================
Before Lock (2 runs):
Total time: 159741
Total time: 249      <-- executeBatch

Total time: 154931
Total time: 209     <-- executeBatch


After Lock (2 runs):
Total time: 162821
Total time: 198     <-- executeBatch

Total time: 159583
Total time: 248     <-- executeBatch

These tests don't mean too much, as I simply ran them on my machine.  I ran them a few times, and the performance of the lock isn't too bad (maybe a few seconds on an operation that takes 2 1/2 minutes). However, the one thing these tests illustrate is that we really should be using the executeBatch / deferSave mechanism.
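
For reference, a batched add looks like this (assuming the monitor-taking executeBatch and addDescriptor variants introduced by this patch):

import org.eclipse.core.runtime.IProgressMonitor;
import org.eclipse.core.runtime.IStatus;
import org.eclipse.equinox.p2.repository.IRunnableWithProgress;
import org.eclipse.equinox.p2.repository.artifact.IArtifactDescriptor;
import org.eclipse.equinox.p2.repository.artifact.IArtifactRepository;

class BatchedAdd {
    // One lock acquisition and one save for all descriptors, instead of
    // one lock/load/save cycle per addDescriptor call.
    IStatus addAll(final IArtifactRepository repo,
            final IArtifactDescriptor[] descriptors, IProgressMonitor monitor) {
        return repo.executeBatch(new IRunnableWithProgress() {
            public void run(IProgressMonitor m) {
                for (IArtifactDescriptor d : descriptors)
                    repo.addDescriptor(d, m);
            }
        }, monitor);
    }
}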
Comment 12 Jeff McAffer CLA 2011-03-03 23:54:53 EST
(In reply to comment #10)
> I know we talked about this today, and I mentioned the alternative of adding a
> lock that you must acquire at the beginning of execute batch (or synchronize on
> another lock).  This means that other threads can call methods (add artifact
> descriptor etc..) while someone is running an execute batch, but only 1 thread
> can call execute batch at one time. I think this is interesting because you can
> now use execute batch to synchronize between multiple processes (using the file
> level lock we acquire), and between threads in a single process (using this
> lock).  Jeff, you mentioned you were against this, but I haven't wrapped my
> head around why this is a bad idea.

"against" is a little harsh but  in general I like consistent semantics.  If executeBatch is meant to exclude others from changing (e.g,. locking) then it should exclude all others. If it is meant to defer saves, then it need not block anyone else (even other batches).  The semantics you suggest are a mix of these two. I'm not against that but would like to see a justifying usecase for the additional semantic complexity.

> > - Must look at the file locking mechanism used.  If it is thread-relative then
> > we are essentially synchronizing within the VM and between VMs.  Really we want
> > to synchronize on a per-agent basis.  Each agent should have their own in memory
> > copy of the repo and the SimpleArtifactRepo implementation keeps that data
> > structure coherent.  Then each in-memory copy needs to be kept in sync with the
> > disk.
> Yes, there really should only be 1 instance of any SimpleArtifactRepository in
> memory.  

Just to be clear, one in-memory instance per repo manager (i.e., agent).  One vm (i.e., OS process) may have many agents in it.

> Tom suggested we use a timestamp in a file (we can put that at the lock

We must have done something like this before.  The file table stuff we use in the registry is a friend but likely overkill.  Just looking to not reinvent the wheel...

> location which is ${artifact_repo_location}/.artifactlock.  I used this because
> we have an artifact repository at the root of an eclipse install, and I didn't
> want to claim this 'OSGi Location'... I assume this place is sacred.

Can you give an example of where this file goes?  seems like it would be in the root of the eclipse install if you were pointing to the normal bundle pool.
Comment 13 Jeff McAffer CLA 2011-03-04 00:02:14 EST
(In reply to comment #11)
Thanks for the numbers.  Can you clarify their meaning?  Before/After lock?  You say the locking might add a couple seconds.  Assuming the numbers are milliseconds I don't see anything that shows 2000ms.  AFAICT on this small sample set the conclusion should be that the performance is the same.
Comment 14 Ian Bull CLA 2011-03-04 00:35:11 EST
(In reply to comment #12)
> "against" is a little harsh but  in general I like consistent semantics.  If
> executeBatch is meant to exclude others from changing (e.g., locking) then it
> should exclude all others. If it is meant to defer saves, then it need not
> block anyone else (even other batches).  The semantics you suggest are a mix of
> these two. I'm not against that but would like to see a justifying use case for
> the additional semantic complexity.

Thanks for the clarification and this is a terrific point.  I can't articulate a good argument for the different semantics.  I am trying to engineer a solution where two threads (maybe the same agent, maybe not) are trying to do something complicated (GC and provision for example) on the same repository.  In this case, they *could* each use executeBatch.  But in reality, these operations are not likely to happen simultaneously by the same agent, and if they are, then I think it's fair to say it's not up to the SimpleArtifactRepository to stop it.  So in short, thanks Jeff and I agree with you. :-)

> 
> > > - Must look at the file locking mechanism used.  If it is thread-relative then
> > > we are essentially synchronizing within the VM and between VMs.  Really we want
> > > to synchronize on a per-agent basis.  Each agent should have their own in memory
> > > copy of the repo and the SimpleArtifactRepo implementation keeps that data
> > > structure coherent.  Then each in-memory copy needs to be kept in sync with the
> > > disk.
> > Yes, there really should only be 1 instance of any SimpleArtifactRepository in
> > memory.  
> 
> Just to be clear, one in-memory instance per repo manager (i.e., agent).  One
> vm (i.e., OS process) may have many agents in it.

Right, good point. So the test cases I've included in the patch essentially do this. I create multiple threads and access different instances of the same repository (same file location, different in-memory instances).  I don't know if this behaves differently on different platforms.

> 
> > Tom suggested we use a timestamp in a file (we can put that at the lock
> 
> We must have done something like this before.  the file table stuff we use in
> the registry is a friend but likely overkill.  Just looking to not reinvent the
> wheel...
Yep. I'll see if I can find anything.

> 
> > location which is ${artifact_repo_location}/.artifactlock.  I used this because
> > we have an artifact repository at the root of an eclipse install, and I didn't
> > want to claim this 'OSGi Location'... I assume this place is sacred.
> 
> Can you give an example of where this file goes?  seems like it would be in the
> root of the eclipse install if you were pointing to the normal bundle pool.

If /home/irbull/eclipse is the root of my eclipse install, then 
/home/irbull/eclipse/.artifactlock is the location we lock 
and the actual lock file is at:
/home/irbull/eclipse/.artifactlock/.metadata/.lock
Comment 15 Ian Bull CLA 2011-03-04 00:54:49 EST
(In reply to comment #13)
> (In reply to comment #11)
> Thanks for the numbers.  Can you clarify their meaning?  Before/After lock? 
> You say the locking might add a couple seconds.  Assuming the numbers are
> milliseconds I don't see anything that shows 2000ms.  AFAICT on this small
> sample set the conclusion should be that the performance is the same.

Sorry. That does look confusing.

Before Lock == before I applied the patch that adds locking
After Lock == after I applied the patch that adds locking

I ran the case where I added 10,000 descriptors twice.

So the second run (before I applied the patch) took 154931 and the first run (after I applied the patch) took 162821.  There is a difference of 7,890ms (almost 8 seconds).  But there are some cases where the performance improved after I applied the patch, so I think it comes down to what else was going on with my machine at the time.  In the end, there isn't a big change in the performance.

During the second run of the 10,000 artifact add, I left my machine alone so those are probably the best numbers to look at:

154931 vs. 159583 (a difference of 4,652ms, or about 0.47ms per addArtifactDescriptor).
Comment 16 Jeff McAffer CLA 2011-03-04 01:50:51 EST
(In reply to comment #14)
> > Can you give an example of where this file goes?  seems like it would be in the
> > root of the eclipse install if you were pointing to the normal bundle pool.
> 
> If /home/irbull/eclipse is the root of my eclipse install, then 
> /home/irbull/eclipse/.artifactlock is the location we lock 
> and the actual lock file is at:
> /home/irbull/eclipse/.artifactlock/.metadata/.lock

hmmm, a new file in the root of the install. Not sure what to do about this.  Feels very "in your face"
Comment 17 Ian Bull CLA 2011-03-04 02:02:18 EST
(In reply to comment #16)
> > /home/irbull/eclipse/.artifactlock/.metadata/.lock
> 
> hmmm, a new file in the root of the install. Not sure what to do about this. 
> Feels very "in your face"

A new (hidden?) 'directory' in the root of the install, but your point is taken. We talked about this a few weeks ago on the Equinox call, but no real alternative made sense.
Comment 18 Jeff McAffer CLA 2011-03-04 03:01:33 EST
If we are going to open a file and read something then why not just use the artifacts.xml?  Put it in one of the xml processing entries at the beginning.

Or, set the file timestamp if there is a collision.

There must be a way we can do this without laying down another file.  Especially in the root of the install.
Comment 19 Ian Bull CLA 2011-03-04 10:46:44 EST
(In reply to comment #18)
> If we are going to open a file and read something then why not just use the
> artifacts.xml?  Put it in one othe xml processing entries at the beginning.
> 
> Or, set the file timestamp if there is a collision.
> 
> There must be a way we can do this without laying down another file. 
> Especially in the root of the install.

I wouldn't put it at the root of the install, but rather in the .artifactlock directory, but I guess that's not the point.

The only reason I was thinking of a different file was to save the cost of re-scanning the artifacts.xml (or more likely its compressed version, artifacts.jar).  But IIRC, the jar format can be parsed sequentially, so we could simply read up to the timestamp and stop there.  Is that what you were thinking?
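
Something like this, assuming the timestamp ends up near the top of artifacts.xml (p2 repositories already carry a p2.timestamp property, which is used here for illustration; the exact placement and format are hypothetical):

import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStreamReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

class TimestampPeek {
    private static final Pattern VALUE = Pattern.compile("value='(\\d+)'");

    // Stream the jar and stop as soon as the timestamp line is seen,
    // rather than parsing the whole document.
    static long readTimestamp(File artifactsJar) throws Exception {
        try (ZipInputStream zip = new ZipInputStream(new FileInputStream(artifactsJar))) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                if (!"artifacts.xml".equals(e.getName()))
                    continue;
                BufferedReader reader = new BufferedReader(new InputStreamReader(zip, "UTF-8"));
                String line;
                while ((line = reader.readLine()) != null) {
                    if (line.contains("p2.timestamp")) {
                        Matcher m = VALUE.matcher(line);
                        if (m.find())
                            return Long.parseLong(m.group(1));
                    }
                }
            }
        }
        return -1L;  // not found
    }
}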
Comment 20 Ian Bull CLA 2011-03-04 16:40:11 EST
Created attachment 190448 [details]
Patch v5

I found a problem when loading a locked repo. I was doing all the locking inside SimpleArtifactRepository, but we also have to manage the locks in the SimpleRepositoryIO (where the repository is read). This patch fixes that problem.  Now, if you cancel the 'load' operation you get the appropriate operation cancelled exception.

I also added some more test cases.  

We still need the more fine-grained timestamp (that Jeff was talking about), and there are a few properties (name, description, etc.) which are not properly shared between instances of the repository. However, these are small problems and do not affect the way the repository is being used today. I would like to push this into this weekend's warm-up builds, as that will give me more time to test that we haven't broken anything.  The outstanding issues don't impact API, so they can be addressed early next week or even in M7.

Pascal, what do you think of the API changes here?
Comment 21 Ian Bull CLA 2011-03-04 16:40:24 EST
Created attachment 190449 [details]
mylyn/context/zip
Comment 22 Ian Bull CLA 2011-03-04 22:16:03 EST
Created attachment 190461 [details]
Patch v6

I realized that we were not setting the timestamp when we first loaded the repository, resulting in an unnecessary read/parse when the first method was called.  I've fixed this in the latest patch.
Comment 23 Ian Bull CLA 2011-03-04 22:19:16 EST
All the tests pass (including the reconciler and UI tests).  I've released the code so we can get the most testing time. I will keep an eye on the warm-up build this weekend, and test as soon as it's available.  There are still improvements to be made, but this provides a framework for us to work in now. I will file new bugs for the outstanding work and mark them here.

Jeff and Pascal, thanks for all the feedback (both here and on skype) :-).
Comment 24 Ian Bull CLA 2011-03-05 01:28:16 EST
I've filed the following bugs for the remaining work.  

338986: Use a more accurate method to determine up-to-date checks for artifact caches
https://bugs.eclipse.org/bugs/show_bug.cgi?id=338986

338988: Some artifact repository fields do not lock and load when modified
https://bugs.eclipse.org/bugs/show_bug.cgi?id=338988

338994: Consider a file level lock when we write the metadata
https://bugs.eclipse.org/bugs/show_bug.cgi?id=338994

338987: Remove the synchronization on executeBatch
https://bugs.eclipse.org/bugs/show_bug.cgi?id=338987
Comment 25 Ian Bull CLA 2011-03-05 01:32:13 EST
I've updated the title of the bug to better reflect what was actually done.