Bug 214752 - Concurrency problem with getRevision()
Summary: Concurrency problem with getRevision()
Status: CLOSED FIXED
Alias: None
Product: EMF
Classification: Modeling
Component: cdo.core (show other bugs)
Version: 1.0   Edit
Hardware: PC Windows XP
: P1 major (vote)
Target Milestone: ---   Edit
Assignee: Eike Stepper CLA
QA Contact:
URL:
Whiteboard:
Keywords: readme
Depends on:
Blocks:
 
Reported: 2008-01-09 09:52 EST by Simon Mc Duff CLA
Modified: 2010-06-29 09:21 EDT (History)
1 user (show)

See Also:


Attachments
Patch v1 (25.17 KB, patch)
2008-01-19 05:58 EST, Eike Stepper CLA
no flags Details | Diff
Patch v2 (2.70 KB, patch)
2008-01-19 09:13 EST, Eike Stepper CLA
no flags Details | Diff
mylyn/context/zip (16.10 KB, application/octet-stream)
2008-01-19 09:13 EST, Eike Stepper CLA
no flags Details
Patch v3 (45.60 KB, patch)
2008-01-19 09:45 EST, Eike Stepper CLA
no flags Details | Diff
Patch v4 (59.99 KB, patch)
2008-01-19 12:36 EST, Eike Stepper CLA
no flags Details | Diff
Patch v5 (59.71 KB, patch)
2008-01-19 12:41 EST, Eike Stepper CLA
no flags Details | Diff
Update test case + patch in MEMAccessor (5.07 KB, patch)
2008-01-19 22:12 EST, Simon Mc Duff CLA
no flags Details | Diff
Update (7.80 KB, patch)
2008-01-19 23:02 EST, Simon Mc Duff CLA
no flags Details | Diff
Update v3 (9.36 KB, patch)
2008-01-20 06:46 EST, Eike Stepper CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Simon Mc Duff CLA 2008-01-09 09:52:27 EST
I got many CDODuplicateRevisionException.. and I'm only in read mode.

Problem:

In CDORevisionResolverImpl.java 

protected CDORevisionImpl getRevision(CDOID id, int referenceChunk, boolean loadOnDemand)

revision = loadRevision(id, referenceChunk);
<< CHANGE CONTEXT - Anythoer process could ask for the same revision
addRevision(revision);


We should :
We should always catch CDODUplicateException when we addRevision except in RevisionManager

or

Add synchronized method at the right spot.
Comment 1 Eike Stepper CLA 2008-01-17 11:36:14 EST
Committed to CVS.

I've overwritten addRevision in tzhe client side r-manager to just log the exception and not propagate it. In rare cases this approach (in contrast to additional synchronization) can lead to redundant requests being sent concurrently. But a simply synchronized block would limit concurrency just too much since it can't recognize if the same CDOID and version is to be requested. There are also cases where even a smarter sync method can't predict if the requested revisions are equal or not. I'd prefer to stay with simply ignoring duplicate request results. Ok?
Comment 2 Eike Stepper CLA 2008-01-17 11:37:25 EST
Committed to CVS
Comment 3 Simon Mc Duff CLA 2008-01-17 13:35:45 EST
(In reply to comment #1)
> Committed to CVS.
> I've overwritten addRevision in tzhe client side r-manager to just log the
> exception and not propagate it. In rare cases this approach (in contrast to
> additional synchronization) can lead to redundant requests being sent
> concurrently. But a simply synchronized block would limit concurrency just too
> much since it can't recognize if the same CDOID and version is to be requested.
> There are also cases where even a smarter sync method can't predict if the
> requested revisions are equal or not. I'd prefer to stay with simply ignoring
> duplicate request results. Ok?

It happens on the server as well. 
What do we do for that case ?
Comment 4 Eike Stepper CLA 2008-01-18 06:13:43 EST
Damn, I seem to still be in jet lag! I suggest the following:
- We remove the CDODuplicateRevisionException completely from the protocol plugin
- The CDORevisionResolver.addRevision() method instead returns the revision already found in the cache (just in case the caller needs the real handle, he can test passed-revision != returned-revision)
- We only use the CDODuplicateRevisionException in the cdo.server plugin between RevisionManager and IStoreWriter:
		public void IStoreWriter.writeRevision(CDORevisionImpl) throws CDODuplicateRevisionException
Comment 5 Simon Mc Duff CLA 2008-01-18 07:10:39 EST
(In reply to comment #4)
> Damn, I seem to still be in jet lag! I suggest the following:
> - We remove the CDODuplicateRevisionException completely from the protocol
> plugin
> - The CDORevisionResolver.addRevision() method instead returns the revision
> already found in the cache (just in case the caller needs the real handle, he
> can test passed-revision != returned-revision)
> - We only use the CDODuplicateRevisionException in the cdo.server plugin
> between RevisionManager and IStoreWriter:
>                 public void IStoreWriter.writeRevision(CDORevisionImpl) throws
> CDODuplicateRevisionException

This will be perfect!! 


Also, when we adjust(add-remove) RevisionHolder (double-linked lists);
I think we can screw up things there as well.

What will happen when we add two RevisionHolder for the same revisions ?

addRevision 
adjustHolder

Comment 6 Eike Stepper CLA 2008-01-19 05:58:35 EST
Created attachment 87319 [details]
Patch v1

RollbackTests are failing
UI is blocking on InvalidationIndication
Comment 7 Eike Stepper CLA 2008-01-19 09:10:42 EST
The RollbackTests failed due to a bug in MEMStore/MEMStoreAccessor:
A duplicate revision exception must not be thrown in IStoreAccessor.release()
I've fixed that.

Now ChunkingWithMEMTest.testWriteNative() fails with:

Thread-1 [CDORevisionImpl] Setting version: v1 -> v2
Thread-1 [CDORevisionImpl] Setting version: v1 -> v2
Thread-1 [CDORevisionImpl] Setting version: v1 -> v2
Thread-1 [CDORevisionImpl] Setting version: v1 -> v2
Thread-1 [CDORevisionImpl] Setting version: v1 -> v2
Thread-1 [CDORevisionImpl] Setting version: v1 -> v2
Thread-1 [CDORevisionImpl] Setting version: v1 -> v2
Thread-1 [CDORevisionImpl] Setting version: v1 -> v2
Thread-1 [CDORevisionImpl] Setting version: v1 -> v1
XXX
[ERROR] Concurrent modification of revision Customer@42v1
java.lang.IllegalStateException: Concurrent modification of revision Customer@42v1
	at org.eclipse.emf.cdo.internal.server.MEMStore.addRevision(MEMStore.java:110)
	at org.eclipse.emf.cdo.internal.server.MEMStoreAccessor.writeRevisionDelta(MEMStoreAccessor.java:141)
	at org.eclipse.emf.cdo.internal.server.RevisionManager$AddRevisionDeltaOperation.phase1(RevisionManager.java:318)
	at org.eclipse.emf.cdo.internal.server.RevisionManager$AddRevisionDeltaOperation.phase1(RevisionManager.java:1)
	at org.eclipse.net4j.internal.util.transaction.Transaction.execute(Transaction.java:66)
	at org.eclipse.emf.cdo.internal.server.RevisionManager.addRevisionDelta(RevisionManager.java:67)
	at org.eclipse.emf.cdo.internal.server.protocol.CommitTransactionIndication.writeRevisions(CommitTransactionIndication.java:286)
	at org.eclipse.emf.cdo.internal.server.protocol.CommitTransactionIndication.indicating(CommitTransactionIndication.java:123)
	at org.eclipse.net4j.signal.IndicationWithResponse.execute(IndicationWithResponse.java:46)
	at org.eclipse.net4j.signal.Signal.runSync(Signal.java:143)
	at org.eclipse.net4j.signal.Signal.run(Signal.java:124)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
	at java.lang.Thread.run(Thread.java:595)

With a breakpoint on CDORevisionImpl.setVersion() I could detect that the call "v1 -> v1" comes from CDORevisionDeltaImpl.apply(CDORevisionImpl). I've no idea yet why it doesn't set the version to 2. Two questions:

a) Why do you call     revision.setVersion(dirtyVersion < 0 ? -dirtyVersion : dirtyVersion);
    and not simply revision.setVersion(dirtyVersion);
    
b) Why do you (in the stream ctor)
    dirtyVersion = originRevision.getVersion();
    originVersion = dirtyRevision.getVersion();
	and not
	dirtyVersion = dirtyRevision.getVersion();
    originVersion = originRevision.getVersion();
	
Comment 8 Eike Stepper CLA 2008-01-19 09:13:08 EST
Created attachment 87326 [details]
Patch v2
Comment 9 Eike Stepper CLA 2008-01-19 09:13:15 EST
Created attachment 87327 [details]
mylyn/context/zip
Comment 10 Simon Mc Duff CLA 2008-01-19 09:38:34 EST
> yet why it doesn't set the version to 2. Two questions:
> a) Why do you call     revision.setVersion(dirtyVersion < 0 ? -dirtyVersion :
> dirtyVersion);
>     and not simply revision.setVersion(dirtyVersion);

I looked and doesn't seem to be required.

> b) Why do you (in the stream ctor)
>     dirtyVersion = originRevision.getVersion();
>     originVersion = dirtyRevision.getVersion();
>         and not
>         dirtyVersion = dirtyRevision.getVersion();
>     originVersion = originRevision.getVersion();

You are right !!! Good catch !!
Comment 11 Eike Stepper CLA 2008-01-19 09:45:08 EST
Created attachment 87328 [details]
Patch v3

The failure in ChunkingWithMEMTest.writeNative() disappeared. No idea why ;-(

v3 makes tests with multiple sessions fail.
Comment 12 Simon Mc Duff CLA 2008-01-19 09:46:14 EST
(In reply to comment #7)
> The RollbackTests failed due to a bug in MEMStore/MEMStoreAccessor:
> A duplicate revision exception must not be thrown in IStoreAccessor.release()
> I've fixed that.

When it is the appropriate time to write our changes to the database ?
I used release as the commit...

If we cannot, I believe we should have a new method to terminate our changes and apply it to the database.

IStoreWriter.commit();

What do you think ?
Comment 13 Eike Stepper CLA 2008-01-19 09:54:06 EST
 (In reply to comment #12)
> When it is the appropriate time to write our changes to the database ?
> I used release as the commit...
That's not a good idea. Look into CommitTransactionIndication:

   try
    {
      StoreUtil.setReader(storeWriter);
      newPackages = readNewPackages(in, storeWriter);
      newResources = readNewResources(in, storeWriter);
      newObjects = readNewObjects(in, storeWriter);
      dirtyObjects = readDirtyObjects(in, storeWriter);
      ITransaction<IStoreWriter> storeTransaction = new Transaction<IStoreWriter>(storeWriter, false);

      try
      {
        addPackages(storeTransaction, newPackages);
        addRevisions(storeTransaction, newResources);
        addRevisions(storeTransaction, newObjects);
        writeRevisions(storeTransaction, dirtyObjects);
        // addRevisions(storeTransaction, dirtyObjects);

        storeTransaction.commit();
      }
      catch (RuntimeException ex)
      {
        OM.LOG.error(ex);
        rollbackMessage = ex.getLocalizedMessage();
        storeWriter.rollback(view, storeTransaction);
      }
    }
    finally
    {
      storeWriter.release();
      StoreUtil.setReader(null);
    }

An exception from release() is handled differently!!

> If we cannot, I believe we should have a new method to terminate our changes and
> apply it to the database.
> 
> IStoreWriter.commit();
I've fixed it for MEMStore already (without new API).
Do you need it for your OODBStore?
This would be new post-of-the-fact trigger I mentioned in bug #214526.
And it doesn't relate to this bug, we should wait until we have a solution for bug #214526.
Comment 14 Eike Stepper CLA 2008-01-19 09:59:03 EST
Damn, patch v3 contains the same failure as v2.
I had a previous launch with a server hanging around ;-(
Killing that removed the 3 test failures (multiple sessions) I described before and unveiled the bug that manifests itself in ChunkingWithMEMTest.writeNative().

Hmm....
Comment 15 Simon Mc Duff CLA 2008-01-19 10:00:39 EST
Transaction management needs a beginning and an end.
I know many relational database and Object database that work like that ? 
I'm surprised that your database doesn't work like that.

In our case, when can a transaction commit its changes to the database ?

A commit could fail for any reason. Commit takes time... you don't want to do it each time we call writeRevision.

If you have another places where I can plug my stuff to commit and maybe throwing an exception.. please let me know how. Otherwise.. I believe it will be good to have a IStoreWriter commit.


Comment 16 Simon Mc Duff CLA 2008-01-19 10:04:00 EST
(In reply to comment #14)
> Damn, patch v3 contains the same failure as v2.
> I had a previous launch with a server hanging around ;-(
> Killing that removed the 3 test failures (multiple sessions) I described before
> and unveiled the bug that manifests itself in
> ChunkingWithMEMTest.writeNative().
> Hmm....

Before you are leaving or if you don't want to look at this bug.. let me know.

Tonight I will look at it.

Let me know how to reproduce it and which patch I need to apply.

Bye!
Comment 17 Eike Stepper CLA 2008-01-19 12:33:32 EST
 (In reply to comment #15)
I've added a commit() method to IStoreWriter.
Comment 18 Eike Stepper CLA 2008-01-19 12:36:54 EST
Created attachment 87329 [details]
Patch v4

This version passes all tests.
The problem was in MEMStoreAccessor.readRevision() where there was a corrupt copy created. I've commented the cloning out.

Please verify if it is needed and has to be corrected!!!

But the UI is still hanging on arrival of a conflicting InvalidationIndication ;-(
Comment 19 Eike Stepper CLA 2008-01-19 12:41:13 EST
Created attachment 87330 [details]
Patch v5

Problem persists...
Comment 20 Eike Stepper CLA 2008-01-19 13:08:00 EST
Found the problem: In the editor at some places Display.syncExec() was used instead of asyncExec(). Fixed.

Committed to CVS.
Comment 21 Simon Mc Duff CLA 2008-01-19 22:12:53 EST
Created attachment 87343 [details]
Update test case + patch in MEMAccessor

I updated RollbackTestCase.

When we add through MEMStore.addRevision we could have an exception.
If this happen we need to ensure that all the other revision we added are removed.
We can put some code in the rollback or we verify at the writeRevision. Before I was verifying in the write.. I was adding everything at the end.

I will personnally prefer to use a copy of the readRevision since I don't want to impact the revisionManager to Garbage collect some objects.

I will update my code at work about the commit.. it will simplify the code!!! Thank you again.
Comment 22 Simon Mc Duff CLA 2008-01-19 23:02:02 EST
Created attachment 87344 [details]
Update

Even if MEMStore doesn't use commit anymore... I will with my oodb.
Comment 23 Eike Stepper CLA 2008-01-20 06:46:43 EST
Created attachment 87348 [details]
Update v3

Simon, why have you changed MEMStoreAccessor to not use commit() anymore?
Comment 24 Simon Mc Duff CLA 2008-01-20 09:31:46 EST
(In reply to comment #23)
> Created an attachment (id=87348) [details]
> Update v3
> Simon, why have you changed MEMStoreAccessor to not use commit() anymore?

I had two choice... I can go either of them.

At the commit time, if I add my revisions to the store... and the version is already there I need to rollback. I will need to remove from the list only object that I successfully added. SO I will need another list.. and use it in the rollback. Make sense ?

The other solution was to check at writeRevision.

Don't worry, the commit is useful. In production so far it happens twice that we had an exception in the IStoreWriter.release. (we are doing the commit there)... so we had to stop the CDOServer and restart it.


Comment 25 Eike Stepper CLA 2008-01-20 11:17:57 EST
(In reply to comment #24)
> At the commit time, if I add my revisions to the store... and the version is
> already there I need to rollback. I will need to remove from the list only
> object that I successfully added. SO I will need another list.. and use it in
> the rollback. Make sense ?
I'd suggest to make all major methods (reads and writes) synchronized to ever have only one store accessor at a time. Then you can iterate the newRevisions twice, first check all then add all or nothing. Or you iterate with an array index, add newRevisions until one fails. Then you remove all the previous revisions.


> The other solution was to check at writeRevision.
Also possible if there is only one reader or writer at a time.

 
> Don't worry, the commit is useful. In production so far it happens twice that
> we had an exception in the IStoreWriter.release. (we are doing the commit
> there)... so we had to stop the CDOServer and restart it.
That's what I said about different exception handling ;-) 

Comment 26 Eike Stepper CLA 2008-04-07 09:36:29 EDT
Any idea if this one can be closed?
Comment 27 Simon Mc Duff CLA 2008-04-07 11:54:59 EDT
(In reply to comment #26)
> Any idea if this one can be closed?

Yes since we return weither true or false when it is already there!
Comment 28 Eike Stepper CLA 2008-04-07 14:52:02 EDT
cleanup
Comment 29 Eike Stepper CLA 2008-04-19 04:48:34 EDT
Fixed in S200804140606
Comment 30 Eike Stepper CLA 2008-06-10 02:28:45 EDT
Reversioned due to graduation