Bug 416861 - Track simrel participation directly from contribution files
Summary: Track simrel participation directly from contribution files
Status: RESOLVED FIXED
Alias: None
Product: CBI
Classification: Technology
Component: CBI p2 Repository Aggregator
Version: unspecified
Hardware: All All
Importance: P3 enhancement
Target Milestone: ---
Assignee: CBI Inbox CLA
QA Contact: David Williams CLA
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-09-09 12:52 EDT by Konstantin Komissarchik CLA
Modified: 2022-06-25 09:23 EDT

See Also:


Attachments

Description Konstantin Komissarchik CLA 2013-09-09 12:52:28 EDT
The truth of what's in simrel at any given time is captured by the contribution files. Rather than having a completely separate system for officially tracking participation, we can publish reports based on data in the contribution files.
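
As a rough illustration of the idea, the sketch below builds a small participation report from contribution files. The XML shape used here (a root element with `label` and `enabled` attributes and nested `repositories` elements carrying a `location`) is an assumption made for this example and may not match the real contribution file schema; the file names and data are invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified contribution files. The element and attribute
# names are assumptions and the data is invented for illustration.
SAMPLE_FILES = {
    "alpha.b3aggrcon": """
        <contribution label="Alpha">
          <repositories location="http://example.org/alpha/releases/1.2"/>
        </contribution>""",
    "beta.b3aggrcon": """
        <contribution label="Beta" enabled="false">
          <repositories location="http://example.org/beta/releases/3.0"/>
        </contribution>""",
}


def participation_report(files):
    """Return one row per contribution file: (label, enabled, repo URLs)."""
    rows = []
    for name in sorted(files):
        root = ET.fromstring(files[name].strip())
        # Assumption: a contribution counts as enabled unless it says otherwise.
        enabled = root.get("enabled", "true") == "true"
        locations = [repo.get("location") for repo in root.iter("repositories")]
        rows.append((root.get("label", name), enabled, locations))
    return rows


if __name__ == "__main__":
    for label, enabled, locations in participation_report(SAMPLE_FILES):
        state = "enabled" if enabled else "disabled"
        print(f"{label}: {state}, repositories: {', '.join(locations)}")
```

A report generated this way can only show what is being aggregated; it cannot say anything about a project's intent, which is the gap the rest of this discussion is about.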
Comment 1 David Williams CLA 2013-09-09 13:27:44 EDT
This is actually very related to a bug I was supposed to open, but have not gotten around to yet, so we might be able to talk/decide both in this one bug ... but don't mean to hijack it, so if you feel it's a separate enough issue, let me know and I'll open a unique one in cross-project list. 

My bug was going to be about the process where we start each new cycle with a "copy" of the previous cycle, but with all "contributions" disabled ... the purpose of that was to get a positive "affirmation" that the project intended to be in the release. And, people did not like it much since it "broke" aggregation until everyone (or, enough projects) "got on board", etc. 

So, the idea was that Wayne's proposal, which has other objectives than simply tracking participation (see
http://dev.eclipse.org/mhonarc/lists/cross-project-issues-dev/msg09766.html),
would give us a source of "positive affirmation" of participation, so no contributions would be disabled, initially ... and each new cycle would start off a little easier. 

Now, (and here's where my bug starts) ... that only leaves the question of deadlines. My initial thinking was that if someone has not posted to cross-project by the end of M1, THEN their contribution would be disabled (at the very beginning of M2). If they had not posted by the end of M3, then their contribution would be removed (at the start of M4 work ... since, after all, M4 is the deadline even for "new" projects to join the train). 

Sound reasonable? Easier than current "disable all right away" process? 
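
The proposed deadlines can be written down as a small rule. This is only a sketch of the policy as described above, not an existing tool; the function name and the milestone list are assumptions made for the example.

```python
# Assumed milestone names for one release cycle, in order.
MILESTONES = ["M1", "M2", "M3", "M4", "M5", "M6", "M7"]


def contribution_state(current_milestone, affirmed):
    """State of a carried-over contribution while work on a milestone is underway.

    affirmed: whether the project has announced its participation on the
    cross-project list.
    """
    if affirmed:
        return "enabled"
    index = MILESTONES.index(current_milestone)
    if index >= MILESTONES.index("M4"):
        return "removed"   # nothing posted by the end of M3
    if index >= MILESTONES.index("M2"):
        return "disabled"  # nothing posted by the end of M1
    return "enabled"       # M1: carried over from the previous cycle as-is


if __name__ == "__main__":
    for milestone in ["M1", "M2", "M4"]:
        print(milestone, contribution_state(milestone, affirmed=False))
```

The point of the rule is that nothing is disabled during M1, so the first aggregation of a new cycle starts from a working state.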

Back to your original bug :) ... and the reason we ask for positive affirmation at all ... is there have been a few cases in the past, where people have decided not to be "in the next" release train, but did not announce it as they should ... so their contributions stayed in until M4 or even M5 or M6 in a few cases before I (or anyone else) noticed ... and their previous contributions continued to build (or, aggregate) just fine ... but, they really did not have the intent to participate. 

So in other words, I think the "contribution files" are not always a 100% accurate indication of participation, unless we have positive affirmation ... and Wayne's proposal, in addition to his other reasons, is an improvement on the means to get positive "affirmation" without the "disablement" step which is pretty disruptive. (At least "we" think it is an improvement ... you tell us ... ) 

Is there yet another, better way to achieve all these goals? 

Thanks,
Comment 2 Wayne Beaton CLA 2013-09-09 13:33:51 EDT
I feel that there is a need to connect the various release records and what-not with the participation. Tracking that participation in the PMI, therefore, is pretty natural. I suppose that we could provide links in the aggregation files (or something) but that's starting to feel too complicated.

The aggregation files are for the technical stitching bits together part, and the PMI/release records are for the process part. That seems a lot simpler to me. 

Unless we can think of some way to seamlessly stitch the two together.
Comment 3 Konstantin Komissarchik CLA 2013-09-09 13:39:12 EDT
What is participation?

I tend to think of simrel as a rolling aggregation. Considering that many projects do not align their release dates with simrel dates and simply contribute the latest release that fits with simrel deadlines, I would argue there is no harm in continuing to aggregate the last release even if the project has decided not to contribute anything new.

There is only harm if a project's old contribution eventually breaks aggregation, but we already know how to address that.
Comment 4 Wayne Beaton CLA 2013-09-09 13:42:29 EDT
One more data point: not every project that participates in the simultaneous release contributes bits to the repository (so not all projects have an aggregation file).
Comment 5 Wayne Beaton CLA 2013-09-09 13:43:51 EDT
(In reply to Konstantin Komissarchik from comment #3)
> There is only harm if a project's old contribution eventually breaks
> aggregation, but we already know how to address that.

Correct me if I'm wrong, but fixing a break can consume hours of David's time. Anything that we can do to reduce the likelihood of that seems like a good idea to me (or am I totally off track?)
Comment 6 Konstantin Komissarchik CLA 2013-09-09 13:45:43 EDT
> The aggregation files are for the technical stitching bits together part, and
> the PMI/release records are for the process part. That seems a lot simpler to
> me. 

Two distinct places for managing a project's participation add to the overhead for all projects and risk one of those places being out of sync. Since we ship what's in the simrel repo, the losing party in out-of-sync questions is always going to be PMI/release records.

I think there are two viable ways to resolve the two places issue:

1. Treat contribution records as the source of truth.
2. Use PMI to manage contribution and discontinue direct editing of contribution files.

#1 is a more approachable solution, while #2 would be a very desirable long term goal.
Comment 7 Konstantin Komissarchik CLA 2013-09-09 13:47:59 EDT
> One more data point: not every project that participates in the simultaneous
> release contributes bits to the repository (so not all projects have an
> aggregation file).

What does that even mean and why is that important?
Comment 8 Konstantin Komissarchik CLA 2013-09-09 13:51:31 EDT
> Correct me if I'm wrong, but fixing a break can consume hours of David's time.

No doubt, but I doubt that the manual opt-in every year saves anyone (including David) time.

The ultimate answer to saving time is automation... I outlined these elsewhere, but I can repeat.

1. Every contribution change should go through gerrit verification. If aggregation validation fails, the contribution change is rejected. Then it's on the contributing party to resolve the issue.

2. Aggregation should mirror all contributed repos prior to aggregation, so that projects can't replace the repo contents and avoid going through contribution process.
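
A minimal sketch of the gate described in point 1, using a toy model in which each contribution provides and requires named units. This is not how the aggregator actually validates repositories; the `Contribution` type, `validate`, and `review_change` are hypothetical names invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contribution:
    """Toy stand-in for a contribution: the units it provides and requires."""
    name: str
    provides: frozenset = frozenset()
    requires: frozenset = frozenset()


def validate(contributions):
    """Return (contribution, missing unit) pairs; empty means validation passes."""
    available = set()
    for c in contributions.values():
        available |= c.provides
    return sorted(
        (c.name, unit)
        for c in contributions.values()
        for unit in c.requires
        if unit not in available
    )


def review_change(current, name, proposed):
    """Accept a change only if the whole set still validates.

    proposed is the new Contribution, or None to remove the contribution.
    Returns (accepted, resulting contributions, problems).
    """
    candidate = dict(current)
    if proposed is None:
        candidate.pop(name, None)
    else:
        candidate[name] = proposed
    problems = validate(candidate)
    if problems:
        return False, current, problems  # rejected: the submitter has to fix it
    return True, candidate, []


if __name__ == "__main__":
    current = {
        "x": Contribution("x", frozenset({"x.core"}), frozenset({"y.util"})),
        "y": Contribution("y", frozenset({"y.util"})),
    }
    # Removing "y" would leave "x" with an unsatisfied requirement.
    accepted, _, problems = review_change(current, "y", None)
    print(accepted, problems)
```

The design choice is that a rejected change leaves the current set untouched, so a failing contribution never reaches the shared aggregation.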
Comment 9 Wayne Beaton CLA 2013-09-09 13:52:51 EDT
(In reply to Konstantin Komissarchik from comment #6)
> Two distinct places for managing project's participation adds to overhead
> for all projects and risks one of those places being out of sync. Since we
> ship what's in simrel repo, the losing party in out of sync questions is
> always going to be PMI/release records.

A project's bits are separate from the process documentation today. I assert that continuing to keep these aspects separate is consistent and actually reduces complexity. i.e. you are suggesting that we leak some aspects of process into the technical aspect.

> I think there are two viable ways to resolve the two places issue:
>
> 1. Treat contribution records as the source of truth.

I assume that you mean the b3aggrcon files. Do these track the release name and offset?

Like I said in Comment 4, some of the projects that participate in the simultaneous release do not contribute to the repository.

> 2. Use PMI to manage contribution and discontinue direct editing of
> contribution files.

I'm willing to discuss this, but for now I believe that this is leaking technical aspects into the process aspect.
Comment 10 Wayne Beaton CLA 2013-09-09 13:56:18 EDT
(In reply to Konstantin Komissarchik from comment #7)
> > One more data point: not every project that participates in the simultaneous
> > release contributes bits to the repository (so not all projects have an
> > aggregation file).
> 
> What does that even mean and why is that important?

A project can be a part of the Luna simultaneous release without contributing bits to the composite repository. i.e. no b3aggrcon file.
Comment 11 Konstantin Komissarchik CLA 2013-09-09 13:59:38 EDT
> I'm willing to discuss this, but for now I believe that this is leaking
> technical aspects into the process aspect.

Even assuming that you can draw a clear line between technical and process aspects on this issue, why is maintaining separation important? I don't know how you can argue that maintaining duplicated information in two different places is simpler. You cannot have out-of-sync issues if you aren't maintaining two separate records.
Comment 12 Konstantin Komissarchik CLA 2013-09-09 14:01:46 EDT
> A project can be a part of the Luna simultaneous release without contributing
> bits to the composite repository. i.e. no b3aggrcon file.

How is this different from a project choosing to release on the same date? Or put another way, what would we lose if we more tightly defined simrel around aggregation?

I can only think of marketing reasons, but those can be addressed by including all projects that release on simrel date in marketing materials.

I argue that important bits of simrel process are all about aggregation and EPP.
Comment 13 Konstantin Komissarchik CLA 2013-09-09 14:07:34 EDT
> I assume that you mean the b3aggrcon files. Do these track the release name and
> offset?

Rather than listing the participating release, the report can list the contributed repo URL. This will typically include the version and is far more useful than just the version.

As to the offset, why is this even important to track/publish? Downstream projects should already know when their dependencies deliver their builds, or they aren't doing a good job of communicating with their dependencies, and no one else should care about these details. Put another way... What do we do differently based on a project's selection of offset?
Comment 14 Wayne Beaton CLA 2013-09-09 14:08:25 EDT
(In reply to Konstantin Komissarchik from comment #12)
> How is this different from a project choosing to release on the same date?

Not all that different. In fact, we usually have one or two projects that release at the same time as the simultaneous release. IMHO, it's a distinction that a project chooses to make on its own.

> Or put another way, what would we lose if we more tightly defined simrel
> around aggregation?

Those projects that do not want/need to join the repository. Hudson is an example that comes to mind. Or Tycho. Or one of the M2M projects that doesn't produce bundles. 

> I can only think of marketing reasons, but those can be addressed by
> including all projects that release on simrel date in marketing materials.

Don't minimize marketing reasons.

> I argue that important bits of simrel process are all about aggregation and
> EPP.

I regard the uber repository as an aspect of the simultaneous release. Process is part of that. Marketing is also part of that. Coordination and communication is a significant part of the simultaneous release.
Comment 15 Konstantin Komissarchik CLA 2013-09-09 14:15:34 EDT
> Don't minimize marketing reasons.

I am not, but why should we burden simrel process with extra work if we can address marketing concerns simply by including all projects that released on that date... or even better, include all projects who have shipped a release since the last marketing blast.
Comment 16 Wayne Beaton CLA 2013-09-09 14:50:06 EDT
(In reply to Konstantin Komissarchik from comment #15)
> > Don't minimize marketing reasons.
> 
> I am not, but why should we burden simrel process with extra work if we can

How much "extra work" do you think we're talking about?

To do a release, a project is already required to create the release record and plan, and then engage in a release review. This has nothing to do with the simultaneous release.

You need to join the cross-project-issues-dev mailing list if you're participating in the simultaneous release. You have to be open and transparent on that list.

To join the simultaneous release, all you have to do is tell cross-project-issues-dev about the release record that you had to create anyway, and specify an offset.

I know that little things add up, but I'm having trouble describing this addition as particularly onerous.

> address marketing concerns simply by including all projects that released on
> that date... or even better, include all projects who have shipped a release
> since the last marketing blast.

I used to do this. It mostly works. I started by finding releases that happened to occur within a few days of the official release date. This got most of them. Then projects like Jetty and EclipseLink started including releases that occurred well in advance of the big date. Sorting it out using convention just doesn't work.
Comment 17 Konstantin Komissarchik CLA 2013-09-09 16:39:45 EDT
> I know that little things add up, but I'm having trouble describing this
> addition as particularly onerous.

Yes, it's not a major item, but the more we can streamline and simplify stuff like this, the more time projects can devote to building value in community and in code.

> > address marketing concerns simply by including all projects that 
> > released on that date... or even better, include all projects who have 
> > shipped a release since the last marketing blast.
> 
> I used to do this. It mostly works. I started by finding releases that happened
> to occur within a few days of the official release date. This got most of them.
> Then projects like Jetty and EclipseLink started including releases that occurred
> well in advance of the big date. Sorting it out using convention just doesn't 
> work.

I agree that querying for releases on a fixed date is unlikely to be effective. More and more projects are choosing to break away from the rigid yearly releases and the process needs to adapt accordingly. Querying for releases made since the last announcement should do the trick, with the added advantage that it would benefit all projects, not just those participating in simrel.
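
A minimal sketch of such a query, assuming release records are available as (project, release name, release date) tuples. The record shape, the function name, and the data are all invented for illustration and do not reflect how release records are actually stored.

```python
from datetime import date

# Invented release records: (project, release name, release date).
RELEASES = [
    ("Project A", "9.0", date(2013, 3, 1)),
    ("Project B", "2.5", date(2013, 6, 10)),
    ("Project C", "1.0", date(2013, 6, 26)),
    ("Project D", "4.1", date(2012, 11, 15)),
]


def releases_since(releases, last_announcement, today):
    """Releases shipped after the last announcement, up to and including today."""
    return [r for r in releases if last_announcement < r[2] <= today]


if __name__ == "__main__":
    for project, name, released in releases_since(
        RELEASES, date(2013, 1, 1), date(2013, 6, 26)
    ):
        print(f"{project} {name} ({released.isoformat()})")
```

Unlike a fixed-date window, this picks up a release made months before the announcement, which is the case that broke the "within a few days" convention.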
Comment 18 David Williams CLA 2013-09-09 19:48:22 EDT
(In reply to Wayne Beaton from comment #5)
> (In reply to Konstantin Komissarchik from comment #3)
> > There is only harm if a project's old contribution eventually breaks
> > aggregation, but we already know how to address that.
> 
> Correct me if I'm wrong, but fixing a break can consume hours of David's
> time. Anything that we can do to reduce the likelihood of that seems like a
> good idea to me (or am I totally off track?)

Minor comments: each "break" is different, some take hours, some are fixed by the time I notice ... but, the problem with having "old contributions" in the repo aggregation, when the project is not intending to participate, is that it gives the community the wrong impression ... they download a "milestone", they look for other things to install, see "project Y" and assume that will be part of the eventual released repo (especially when it gets up to M4, M5, etc.) ... but, it won't be. So it becomes a "last minute" surprise that might affect adopters'/users' plans. 

And, this is not technically just "users and adopters" ... though that's my main concern about being sure aggregation files are accurate ... every contribution "in" the sim. rel. repo can in theory affect how other contributions "work" ... a simple example (that happens more often than you'd know) is that project X might "accidentally", "unknowingly" be picking up a dependency from project Y ... so if project Y is not supposed to be there ... the earlier Y is no longer in the repo, the earlier the "missing dependency" will be discovered, and corrective action can be taken.
Comment 19 Konstantin Komissarchik CLA 2013-09-10 11:45:21 EDT
There is no doubt that projects withdrawing from aggregation is disruptive, but requiring all projects to re-opt-in at the start of the release does relatively little to mitigate that risk as there is nothing preventing a project from initially opting in and later dropping out as deadlines mount.

When you balance the overhead of this process and the scramble it causes in early milestones against the occasional benefit and incomplete management of the stated risk, I think it would be better overall to discontinue this policy.
Comment 20 Wayne Beaton CLA 2013-09-10 13:00:13 EDT
(In reply to Konstantin Komissarchik from comment #19)
> There is no doubt that projects withdrawing from aggregation is disruptive,
> but requiring all projects to re-opt-in at the start of the release does
> relatively little to mitigate that risk as there is nothing preventing a
> project from initially opting in and later dropping out as deadlines mount.
> 
> When you balance the overhead of this process and scramble it causes in
> early milestones against the occasional benefit and incomplete management of
> the stated risk, I think it would be better overall to discontinue this
> policy.

I think that you're overstating the scramble.

A representative of a participating project should already have a copy of the simrel repository. They need to pull, modify their file, commit, and push. I'd expect that many projects would need to update elements of their b3aggrcon file anyway; if not, then they really should be reviewing it for correctness. Then they need to assemble their release record/plan, and send an email.

The only part that I think that we can reasonably describe as onerous is the creation of a plan. All projects are required by the EDP to assemble a plan at the beginning of every release cycle. The only reason why this might be considered onerous at this point is that we didn't have a reasonably good way of keeping track of whether it was done or not in the past, and now we do. i.e. you used to be able to get away without doing it.

Regardless, early explicit opt-in has several benefits. First, it ensures that all of the participants are actually on board and mentally prepared to get started. Second, it gives everybody who is watching confidence that the projects that they depend on are mentally prepared. I expect that this will reduce the incidents of last minute scramble to get on board before M4, and am hopeful that there will be fewer late arrivals.
Comment 21 Konstantin Komissarchik CLA 2013-09-10 13:25:44 EDT
It is hard to overstate the mess that we've had at the start of Luna. It is not simply a matter of tweaking your contribution file to start participating when your project dependencies have been disabled and their maintainers are on vacation. We almost didn't have anything to deliver for M1. What exactly was the benefit of that? For most projects, their Kepler contribution would have still worked and allowed everyone's contributions to be incrementally updated.

Many parts of the simrel process, including the opt-in process and disabling all contributions at the start, presume the model where all projects align their releases with simrel. Since that's less and less the path that is chosen by participating projects, the process needs to adapt accordingly.

Many projects will go through multiple releases in the span of a single simrel cycle. Some will not produce a new release. The idea that everyone is defining and setting up their release during M1 or M2 does not reflect the reality.
Comment 22 Wayne Beaton CLA 2022-01-05 17:25:45 EST
Ed, I recall that you're currently working on something related to this. Should we leave this issue open, or are you tracking the work somewhere else?
Comment 23 Ed Merks CLA 2022-01-05 23:49:31 EST
(In reply to Wayne Beaton from comment #22)
> Ed, I recall that you're currently working on something related to this.
> Should we leave this issue open, or are you tracking the work somewhere else?

Perhaps we should move this issue to

CBI -> CBI p2 Repository Aggregator

The work I've been doing on tracking project (aggrcon) dependencies is being targeted there.  I have no specific bugs for tracking that work yet...
Comment 24 Ed Merks CLA 2022-06-25 09:23:12 EDT
I've implemented this type of support as part of this:

https://github.com/eclipse-cbi/p2repo-aggregator/issues/3