Bug 408138 - [reconciler] Dropins are broken in after master configuration change
Summary: [reconciler] Dropins are broken in after master configuration change
Status: RESOLVED WORKSFORME
Alias: None
Product: Equinox
Classification: Eclipse Project
Component: p2 (show other bugs)
Version: 3.9.0 Kepler   Edit
Hardware: PC Linux
: P3 critical (vote)
Target Milestone: ---   Edit
Assignee: Krzysztof Daniel CLA
QA Contact:
URL:
Whiteboard: stalebug
Keywords:
Depends on:
Blocks: 378329 410710
  Show dependency tree
 
Reported: 2013-05-15 10:47 EDT by Krzysztof Daniel CLA
Modified: 2019-09-04 01:57 EDT (History)
3 users (show)

See Also:
krzysztof.daniel: review? (pascal)


Attachments
Passing test case for the reference. (3.05 KB, patch)
2013-05-16 02:18 EDT, Krzysztof Daniel CLA
no flags Details | Diff
Partial fix for the problem (2.04 KB, patch)
2013-05-16 11:55 EDT, Krzysztof Daniel CLA
no flags Details | Diff
Fix proposition. (3.90 KB, patch)
2013-05-16 12:18 EDT, Krzysztof Daniel CLA
no flags Details | Diff
A Patch with test (7.83 KB, patch)
2013-05-17 08:14 EDT, Krzysztof Daniel CLA
no flags Details | Diff
A Patch with test v2 (8.40 KB, patch)
2013-05-27 07:30 EDT, Krzysztof Daniel CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Krzysztof Daniel CLA 2013-05-15 10:47:55 EDT
Here is a scenario reproduce with Eclipse Kepler M7 20130509-1105 downloaded from eclipse.org.

a) As a superuser, unzip Eclipse
b) As a superuser, put a plugin into dropins
c) As a user, run Eclipse. Notice the plugin being discovered. Shutdown Eclipse.
d) As a superuser, touch config.ini
e) As a user, run Eclipse. Notice the plugin not being present in the PluginsRegistry. Notice errors in the error log (action sets not found).

Dropins will not work anymore until ~/.eclipse is deleted.

Running droppins with debug tracing shows:



[p2] Wed May 15 16:43:00 CEST 2013 - [Start Level Event Dispatcher] [reconciler] Master profile changed.
[p2] Wed May 15 16:43:00 CEST 2013 - [Start Level Event Dispatcher] [reconciler] Performing reconciliation.
[p2] Wed May 15 16:43:01 CEST 2013 - [Start Level Event Dispatcher] [reconciler] [dropins] Interesting feature or bundle added: /home/kdaniel/Downloads/eclipse/dropins/org.ancit.search.sysout_1.0.0.201305091540.jar
[p2] Wed May 15 16:43:01 CEST 2013 - [Start Level Event Dispatcher] [reconciler] Master profile changed.
[p2] Wed May 15 16:43:01 CEST 2013 - [Start Level Event Dispatcher] [reconciler] Performing reconciliation.
[p2] Wed May 15 16:43:01 CEST 2013 - [Start Level Event Dispatcher] [reconciler] Profile timestamp not found in cache.
[p2] Wed May 15 16:43:01 CEST 2013 - [Start Level Event Dispatcher] [reconciler] Performing reconciliation.
[p2] Wed May 15 16:43:02 CEST 2013 - [Start Level Event Dispatcher] [reconciler] Nothing to do.
[p2] Wed May 15 16:43:02 CEST 2013 - [Start Level Event Dispatcher] [reconciler] Writing out timestamps to file : /home/kdaniel/.eclipse/org.eclipse.platform_4.2.0_1677589827_linux_gtk_x86_64/configuration/org.eclipse.osgi/bundles/127/data/cache.timestamps

I wonder why the change is discovered 3 times, but can't say for now whether it is or is not related to the issue.

It also looks like this bug is related for nearly 80% of problems I described in my recent post to p2-dev, because RPM writes platform into master, and various packages into dropins.
Comment 1 Krzysztof Daniel CLA 2013-05-16 02:17:37 EDT
Initial investigation:
The bundle is present in the profile but is not present in the bundles.info.
A test case replaying the reproduction steps passes all green, so it must be sth else cutting in.
Comment 2 Krzysztof Daniel CLA 2013-05-16 02:18:42 EDT
Created attachment 231054 [details]
Passing test case for the reference.
Comment 3 Krzysztof Daniel CLA 2013-05-16 09:45:17 EDT
The issue is most likely introduced by commit bc11d565b32b5b5c8bb5c93e5ea666b1d06b43ff.

Method internalGetProfile got 
+ //This is the first time we create the shared profile. Tag it as such and also remember the timestamp of the base
+ [..]
+ internalSetProfileStateProperty(profile, profile.getTimestamp(), SIMPLE_PROFILE_REGISTRY_INTERNAL + getBaseTimestamp(profile.getProfileId()), getBaseTimestamp(id));

But then, each time whether it is considered whether to reset the profile, the method ignoreExistingProfile checks for:

+ if (internalGetProfileStateProperties(profile, SIMPLE_PROFILE_REGISTRY_INTERNAL + baseTimestamp, false).size() != 0)

Which is true unless the master profile is changed. This is not enough, because  a config.ini can cause user configuration drop, and it should also be a reason to drop the profile.
Comment 4 Krzysztof Daniel CLA 2013-05-16 11:55:45 EDT
Created attachment 231097 [details]
Partial fix for the problem

This patch reestablishes the connection between the profile registry and the configurator, so the profile registry will reset the profile if configurator request it to do so, as it was initially designed.
Comment 5 Krzysztof Daniel CLA 2013-05-16 11:58:21 EDT
Forgot to add - the patch is only "partial" because it fixes dropins to the point that the bundles.info is properly written down, but bundles from dropins are not loaded until restart. I'm investigating this right now.
Comment 6 Krzysztof Daniel CLA 2013-05-16 12:18:31 EDT
Created attachment 231103 [details]
Fix proposition.
Comment 7 Pascal Rapicault CLA 2013-05-16 13:34:32 EDT
Could you please create a patch with both the code change and the test case.
Comment 8 Pascal Rapicault CLA 2013-05-16 13:37:20 EDT
Never mind my last comment about creating a new patch.
Comment 9 Krzysztof Daniel CLA 2013-05-16 13:46:59 EDT
The test case worked because running the reconciler and verifying the result between was removing the drop user configuration property.
Comment 10 Pascal Rapicault CLA 2013-05-16 13:49:44 EDT
Two questions:
- does the test case you've added (testReadOnlyDropinsStartup2) in the first patch covers your scenario?
- why do you need to change reconcileReadOnly?
Comment 11 Krzysztof Daniel CLA 2013-05-16 13:54:28 EDT
The test case is not really to be released. I've attached it only as a reference because it was weird that I was able to reproduce the issue manually, but not using the attached test case.

I suspect it was because of the reconciler app quitting without really refreshing bundles - so there was no need for communication between simple configurator and profile registry.

Making the dir as readonly could play a role, too.

There is no test currently for this bug.
Comment 12 Pascal Rapicault CLA 2013-05-16 14:02:38 EDT
> There is no test currently for this bug.
   could you please create one because otherwise this is poised to happen again.

At this point I don't really want to hold off the release of this, so here is what I'm proposing. I release the fix in RC1 so you can confirm that that everything is all good from your end, and you give me the test patch asap and I will release it for RC2.
Do we have a deal?
Comment 13 Krzysztof Daniel CLA 2013-05-16 14:10:14 EDT
(In reply to comment #12)

> Do we have a deal?

I can offer only my best intentions and time, a few suggestions how to write tests for this issue would definitely help :-). I'll do my best to prepare the test tomorrow!
Comment 14 Pascal Rapicault CLA 2013-05-16 14:21:54 EDT
For test inspiration you can take a look at the test class that you've modified, as well as the test class called BaseChange (and others in the p2.tests.sharedinstall packages) that perform end to end testing of the migration.
Comment 15 Pascal Rapicault CLA 2013-05-16 14:22:41 EDT
And finally one question, the commit comment says "First part of the fix...." but I assume that this is the complete patch right?
Comment 16 Krzysztof Daniel CLA 2013-05-17 01:48:41 EDT
(In reply to comment #15)
I have added a new patch since that comment which fixes all the issue:
https://bugs.eclipse.org/bugs/attachment.cgi?id=231103
Comment 17 Krzysztof Daniel CLA 2013-05-17 08:14:53 EDT
Created attachment 231149 [details]
A Patch with test
Comment 18 Pascal Rapicault CLA 2013-05-17 23:05:24 EDT
At this point this change makes me extremely nervous. I've went away from the system property based approach because I ended up in situations where the profile was being reset multiple times resulting in "profile out of sync" errors.

Of course by now all the details are blurry, but here are two different scenarios where spurious profile resets were appearing:
 - one was caused by the profile being GC'ed from memory, and thus when the profile was reloaded it would be recreated again
 - the other occurred when multiple agents were created for the same p2 location.

The only way I can see this being shipped in Kepler is by putting the change you've done in the ignoreExistingProfile method behind an if, guarded by a system property that you can put in your config.ini product.

Then we can enable the complete patch in Luna which will give us enough time to work out potential problems.

Btw, I think that the problem happens w/o involving the dropin. If you just change the config.ini then the profile does not get reset.


Finally a workaround you could use in your setup is to duplicate the most recent profile under a new timestamp when you modify the config.ini.
Comment 19 Pascal Rapicault CLA 2013-05-26 17:26:46 EDT
I had accidentally pushed that to master... and this caused the Cancellation test to fail and I have reverted the commit.
Comment 20 Krzysztof Daniel CLA 2013-05-27 07:30:01 EDT
Created attachment 231541 [details]
A Patch with test v2

The patch can be enabled only when the property is set in the config.ini.

I am afraid that given current lack of communication between p2 and SimpleConfigurator I can't provide better solution.
Comment 21 Eclipse Genie CLA 2018-10-09 11:44:14 EDT
This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet.

If you have further information on the current state of the bug, please add it. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant.

--
The automated Eclipse Genie.
Comment 22 Lars Vogel CLA 2019-09-04 01:54:35 EDT
This bug was marked as stalebug a while ago. Marking as worksforme.

If this report is still relevant for the current release, please reopen and remove the stalebug whiteboard tag.
Comment 23 Lars Vogel CLA 2019-09-04 01:57:59 EDT
This bug was marked as stalebug a while ago. Marking as worksforme.

If this report is still relevant for the current release, please reopen and remove the stalebug whiteboard tag.