In the arch call today, I mentioned the testing that DJ and I were conducting to run the build from git as described in bug 344152. John remarked that the granularity of the Eclipse/Equinox git repos (currently just mirrored, not migrated) might need to be revisited before the actual migration. This bug is to discuss and decide on the granularity of the Eclipse and Equinox repositories before we migrate. Today they are divided into the following repos: platform, pde, jdt, equinox, and p2, as listed at http://dev.eclipse.org/git/index.html
I think the answer is one repository per component (sub-project). I.e., each unique committer group would have one repo. So we would have repos like "SWT", "Platform UI", "Platform Resources", "JDT Core", "JDT UI", etc. This is the minimum number of repositories we can possibly have, since multiple committer groups within a single git repository is not feasible.
Following the groups seems to lead to too fine a granularity to me. I think the people we make committers are reasonable enough that they will not go and change source code they don't know or own. I would have gone for coarser repos, like JDT (including all JDT-related things) and PDE. For the platform itself, it is not clear how to split it.
I am with John, at least for the Platform. I wouldn't want to have to clone all of Platform Text, Ant, Debug, Resources, SWT, etc. if I am working on Platform UI. Also, the "one unix-group, one Git repository" is a simple enough rule of thumb that it could actually work.
(In reply to comment #3)
> I am with John, at least for the Platform. I wouldn't want to have to clone all of Platform Text, Ant, Debug, Resources, SWT, etc. if I am working on Platform UI.
>
> Also, the "one unix-group, one Git repository" is a simple enough rule of thumb that it could actually work.

You wouldn't necessarily have to clone those others, if they had git repos, and you could install the necessary bits.

So, I'd be for Pascal's approach, so something like:

org.eclipse.jdt.git
org.eclipse.pde.git
(In reply to comment #4)
> (In reply to comment #3)
> > I am with John, at least for the Platform. I wouldn't want to have to clone all of Platform Text, Ant, Debug, Resources, SWT, etc. if I am working on Platform UI.
>
> You wouldn't necessarily have to clone those others, if they had git repos, and you could install the necessary bits.

If all of Eclipse Platform was in a single Git repository, a Git clone operation would give me all these other components in source form, including their history. This is way too much; I stand by my opinion that the granularity should not be coarser than the Eclipse project structure and its associated Unix groups.
(In reply to comment #3)
> Also, the "one unix-group, one Git repository" is a simple enough rule of thumb that it could actually work.

This also makes perfect sense to me, at least for the Eclipse Project.

PW
(In reply to comment #6)
> (In reply to comment #3)
> > Also, the "one unix-group, one Git repository" is a simple enough rule of thumb that it could actually work.
>
> This also makes perfect sense to me, at least for the Eclipse Project.
>
> PW

+1. I think this is the simplest way forward. Question: if we decide to go more fine- or coarse-grained in the future, how hard is it to change later? Is this something we can work through and change during the Juno release, but after that we are pretty much set in stone?
(In reply to comment #7)
> +1 I think this the most simple way forward. Question, if we decide to go more fine/coarse grain in the future how hard is it to change later? Is this something we can work through and change during the Juno release, but after that we are pretty much set in stone?

Depends on what you mean by "hard" :-)

With git's ability to re-write history, you can stitch 2 repos together, even changing their directory location within the repo [1] while keeping the commit information generally intact.

But if you need to reproduce Juno builds, you wouldn't be able to move them very far, would you? Or conversely, you would have to leave an abandoned "big" repo to rebuild parts of Juno *and* move the history with you to the smaller ones so you could find out stuff.

[1] you can take repo1/proj1 and repo2/proj2 and create bigRepo/bundles/proj1,proj2

PW
I think the decision we make here has an impact on the solution to bug 345670. If we have more than one project per repo, how will Eclipse-BundleSource headers work? As I understand it, you can only clone complete git repositories. So if a user of PDE imports a single bundle at some specific version from a repo, what has to happen?

- The complete repo where that bundle lives has to be cloned, and then the single project from the cloned repo needs to be imported into the workspace.
- Now let's imagine the user selects another bundle which exists in the same repository, but its tagged version is not available in the previously cloned repo. Would we now need to create another clone of the repo from the necessary commit tag and then import that project?

I'm probably missing something in git. There probably is a good way to tag things so that this works nicely?
(In reply to comment #9)
> I think the decision we make here has an impact on the solution to bug 345670.
> If we have more than one project per repo how will Eclipse-BundleSource headers work?

Yes and no... The Eclipse-BundleSource header for git would need to work for any kind of git repository layout. The solution to that problem shouldn't be tailored to the particular repository layout used by the Eclipse & Equinox projects (this bug). I don't think we should be constraining our project layout to simplify the implementation of Eclipse-BundleSource for Git.

> Would we now need to create another clone of the repo from the necessary commit tag and then import that project?

Projects are imported from a Git clone's working copy. Typically there is only one working copy per clone, and that working copy contains the contents of a single branch/tag. So yes, we would need a separate clone per distinct branch. This would be somewhat expensive, but I suspect it's also a rare case: for example, a user who wants one bundle from 3.7 and another bundle from 3.6 in their workspace at the same time. I think we should continue the discussion of how to implement Eclipse-BundleSource for Git in bug 348040.
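The "one working copy per clone" reasoning above can be illustrated with a small sketch. This is only a rough model, not how PDE implements Eclipse-BundleSource imports: the function name `clones_needed`, the tag names, and the request tuples are all made up for illustration. It groups requested bundles by (repository, tag), since each distinct pair would need its own clone under the assumption stated in the comment.

```python
from collections import defaultdict


def clones_needed(requests):
    """Group requested bundles by (repository, tag).

    Assumption from the discussion: a clone has one working copy holding a
    single branch/tag, so every distinct (repository, tag) pair needs its
    own clone.
    """
    clones = defaultdict(list)
    for bundle, repo, tag in requests:
        clones[(repo, tag)].append(bundle)
    return dict(clones)


# Hypothetical import requests: (bundle, repository, tag).
requests = [
    ("org.eclipse.jface", "eclipse.platform.ui", "R3_7"),
    ("org.eclipse.ui", "eclipse.platform.ui", "R3_7"),
    ("org.eclipse.ui.ide", "eclipse.platform.ui", "R3_6"),
]

plan = clones_needed(requests)
for (repo, tag), bundles in plan.items():
    print(f"clone {repo} at {tag}: {', '.join(bundles)}")
```

Two bundles at the same tag share one clone; the bundle from the older tag forces a second clone of the same repository, which is the "somewhat expensive but rare" case described above.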
(In reply to comment #10)
> I think we should continue the discussion of how to implement Eclipse-BundleSource for Git in bug 348040.

I meant bug 345670. Hopefully Orion search indexer performance is a totally unrelated problem ;)
(In reply to comment #3)
> I am with John, at least for the Platform. I wouldn't want to have to clone all of Platform Text, Ant, Debug, Resources, SWT, etc. if I am working on Platform UI.
>
> Also, the "one unix-group, one Git repository" is a simple enough rule of thumb that it could actually work.

+1. I don't want to clone everything if I am just working on JDT/UI.
+1 to have one Git repo per ACL (Unix group).
Note we will also need another repository for common things: map files and documentation. I think the easiest solution is for the platform, JDT, and PDE doc to all be in this single repository with an appropriate access control list so all committers can write to it.
Adding Wayne to this discussion. Seems like this is the sort of thing that has come up, or will come up, in other projects, and he might have some perspective. Similarly, I see Denis is already on the bug. Any feedback on this from the Webmaster point of view? As for my vote, I like the "simplest possible" approach of one repo per ACL, as that does indeed seem simple. I wonder what others are doing.
(In reply to comment #15)
> Adding Wayne to this discussion. Seems like this is the sort of thing that has/will come up in other projects and he might have some perspective.

FWIW, I've been monitoring it (I even Tweeted about it).

Like many other things, it's a balancing act. I've seen projects set up a single repository for each bundle. That's probably too extreme. Dividing it up along ACL lines is probably as granular as I'd like to see (essentially one repository per Eclipse Project). Even at that level of granularity, I suspect that some of the repository clones will still be huge.

It may be worth experimenting to see how huge before you make a decision. You may consider further dividing by functional areas or something, e.g. subsets that people working in particular functional areas need to have.
(In reply to comment #16)
> Even at that level of granularity, I suspect that some of the repository clones will still be huge.
>
> It may be worth experimenting to see how huge before you make a decision.

Some of our repositories are indeed huge. The Eclipse TLP CVS repository is 15GB. One thing that really bloats our CVS repository is our current practice of checking compiled code into our repository in several cases (compiled native libraries, base builder). We are looking at using a p2 repository for binary artifacts going forward, and omitting all binaries during our CVS->Git export. This should greatly help with keeping the size down. It will still be interesting to see the Git repository sizes before we make any final decision.

Kim just wanted to get some consensus beforehand, because the migration step is going to be quite complicated and we don't want to change our minds half way through if we can avoid it!
(In reply to comment #17)
> (In reply to comment #16)
> > Even at that level of granularity, I suspect that some of the repository clones will still be huge.
> >
> > It may be worth experimenting to see how huge before you make a decision.
>
> Some of our repositories are indeed huge. The Eclipse TLP CVS repository is 15GB. One thing that really bloats our CVS repository is our current practice of checking compiled code into our repository in several cases (compiled native libraries, base builder). We are looking at using a p2 repository for binary artifacts going forward, and omitting all binaries during our CVS->Git export. This should greatly help with keeping the size down. It will still be interesting to see the Git repository sizes before we make any final decision.
>
> Kim just wanted to get some consensus beforehand, because the migration step is going to be quite complicated and we don't want to change our minds half way through if we can avoid it!

+1 for stopping the practice of checking binaries into the source repository. There are better ways, and p2 repos would be the recommendation, especially if you are not going to use maven and maven.eclipse.org to share artifacts.
> Similarly, I see Denis is already on the bug. Any feedback on this from the Webmaster point of view?

Thanks. For sure, one repo per unix group cuts down on administrivia and complexity. The fewer extended ACLs we create, the easier it is on everyone.

(In reply to comment #16)
> repository per Eclipse Project). Even at that level of granularity, I suspect that some of the repository clones will still be huge.

The git mirrors can provide early clues... Projects like AJDT have a long history, and they are correspondingly quite big. FWIW, I ran an aggressive compaction on the git mirror repos just yesterday.

21M org.eclipse.actf
430M org.eclipse.ajdt
500K org.eclipse.albireo
3.2M org.eclipse.amalgam
40M org.eclipse.amp
9.8M org.eclipse.ant
72K org.eclipse.apogee
9.0M org.eclipse.atf
1.6M org.eclipse.babel
235M org.eclipse.birt
84K org.eclipse.blinki
4.7M org.eclipse.bpel
274M org.eclipse.cdt
2.9M org.eclipse.cloudfree
15M org.eclipse.cobol
3.4M org.eclipse.compare
24M org.eclipse.core
2.2M org.eclipse.corona
316M org.eclipse.cosmos
136K org.eclipse.cvs
2.0G org.eclipse.dash
50M org.eclipse.datatools
7.8M org.eclipse.dd
14M org.eclipse.debug
56M org.eclipse.dltk
39M org.eclipse.e4
55M org.eclipse.ecf
81K org.eclipse.edt
81K org.eclipse.egl
276M org.eclipse.emf
124M org.eclipse.epf
100M org.eclipse.epp
863M org.eclipse.equinox
68M org.eclipse.equinox.p2
127M org.eclipse.ercp
16M org.eclipse.esl
30M org.eclipse.examples
973K org.eclipse.fproj
22M org.eclipse.gef
54M org.eclipse.gmf
8.0M org.eclipse.gmp
2.0M org.eclipse.gmt
17M org.eclipse.gyrex
9.8M org.eclipse.help
6.9M org.eclipse.hibachi
72M org.eclipse.higgins
225M org.eclipse.hyades
1.6M org.eclipse.ide4edu
247M org.eclipse.jdt
13M org.eclipse.jface
376K org.eclipse.jsch
23M org.eclipse.jwt
3.0M org.eclipse.ltk
40M org.eclipse.m2m
42M org.eclipse.m2t
2.7M org.eclipse.maynstall
93M org.eclipse.mdt
27M org.eclipse.mtj
89M org.eclipse.mylyn
4.7M org.eclipse.nab
276K org.eclipse.ofmp
465M org.eclipse.ohf
213M org.eclipse.orbit
15M org.eclipse.osgi
89M org.eclipse.pde
37M org.eclipse.pdt
72K org.eclipse.pdtincubato
71M org.eclipse.phoenix
121M org.eclipse.platform
6.4M org.eclipse.pmf
246M org.eclipse.ptp
67M org.eclipse.rap
1.5G org.eclipse.releng
81K org.eclipse.remus
25M org.eclipse.riena
81K org.eclipse.sapphire
1.5M org.eclipse.scripting
55M org.eclipse.sdk
2.5M org.eclipse.search
344K org.eclipse.soc
266M org.eclipse.swt
24M org.eclipse.team
2.3M org.eclipse.test
1.3M org.eclipse.text
27M org.eclipse.tigerstripe
24M org.eclipse.tm
107M org.eclipse.tmf
3.3M org.eclipse.tml
7.4M org.eclipse.tomcat
141M org.eclipse.tptp
1.8M org.eclipse.ua
71M org.eclipse.ui
464K org.eclipse.uml2
9.9M org.eclipse.update
2.4M org.eclipse.vcm
43M org.eclipse.ve
1.4M org.eclipse.webdav
403M org.eclipse.webtools
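To spot the largest mirrors in a listing like the one above, the sizes can be converted to bytes and ranked. The following is a minimal sketch, assuming the K/M/G suffixes are powers of 1024 (as in `du -h` output; the comment does not say which tool produced the numbers). The helper names `parse_size` and `largest` are illustrative, and the sample uses five entries copied from the listing.

```python
# Assumed unit multipliers: binary prefixes, as used by `du -h`.
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def parse_size(text):
    """Convert a human-readable size such as '430M' or '2.0G' to bytes."""
    number, unit = text[:-1], text[-1]
    return int(float(number) * UNITS[unit])


def largest(listing, count=3):
    """Return the names of the `count` largest repositories in the listing."""
    rows = []
    for line in listing.strip().splitlines():
        size, name = line.split()
        rows.append((parse_size(size), name))
    return [name for _, name in sorted(rows, reverse=True)[:count]]


# A few entries copied from the mirror listing above.
sample = """
430M org.eclipse.ajdt
2.0G org.eclipse.dash
72K org.eclipse.apogee
863M org.eclipse.equinox
1.5G org.eclipse.releng
"""

print(largest(sample))
# prints ['org.eclipse.dash', 'org.eclipse.releng', 'org.eclipse.equinox']
```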
I looked at the projects in our existing CVS repos and sorted them by unix group to create a first draft of how our git repos might be organized. http://wiki.eclipse.org/Platform-releng/Git_Migration_Granularity
(In reply to comment #20)
> I looked at the projects in our existing CVS repos and sorted them by unix group to create a first draft of how our git repos might be organized.
>
> http://wiki.eclipse.org/Platform-releng/Git_Migration_Granularity

Under Framework I see this:

(from compendium)
rt.equinox.framework org.eclipse.osgi.services
rt.equinox.framework org.eclipse.osgi.util

At first I thought this was a mistake because I had always thought these bundles were under the rt.equinox.bundles committer group and would have gone into the Equinox Bundles git repo. But it appears this is not the case. So I am just confirming that I think this is fine and we can include the above bundles in the Equinox Framework git repo along with the other projects in the rt.equinox.framework committer group.
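The "sort projects by unix group, one repository per group" rule can be sketched as a simple grouping step. This is an illustration only, not the script used to build the wiki draft: the function name `repos_by_group` is made up, the `/gitroot/equinox/<group>.git` path pattern is inferred from the repository names proposed in this bug, and the `org.example.bundle` entry is a hypothetical placeholder.

```python
from collections import defaultdict


def repos_by_group(project_groups, gitroot="/gitroot/equinox"):
    """Map each unix group to one repository path and list its projects.

    The path pattern <gitroot>/<group>.git is an assumption based on the
    repository names proposed in this discussion.
    """
    repos = defaultdict(list)
    for project, group in project_groups.items():
        repos[f"{gitroot}/{group}.git"].append(project)
    return {path: sorted(projects) for path, projects in repos.items()}


project_groups = {
    # These two pairings come from the comment above.
    "org.eclipse.osgi.services": "rt.equinox.framework",
    "org.eclipse.osgi.util": "rt.equinox.framework",
    # Hypothetical project, only here to show a second group.
    "org.example.bundle": "rt.equinox.bundles",
}

layout = repos_by_group(project_groups)
for path, projects in sorted(layout.items()):
    print(path, projects)
```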
Last night, I looked at the list and decided that this would be the initial list of repositories that we require:

/gitroot/jdt/eclipse.jdt.core.git
/gitroot/jdt/eclipse.jdt.debug.git
/gitroot/jdt/eclipse.jdt.ui.git
/gitroot/jdt/eclipse.jdt.git
/gitroot/platform/eclipse.platform.git
/gitroot/platform/eclipse.platform.debug.git
/gitroot/platform/eclipse.platform.releng.git
/gitroot/platform/eclipse.platform.resources.git
/gitroot/platform/eclipse.platform.runtime.git
/gitroot/platform/eclipse.platform.swt.git
/gitroot/platform/eclipse.platform.team.git
/gitroot/platform/eclipse.platform.text.git
/gitroot/platform/eclipse.platform.ua.git
/gitroot/platform/eclipse.platform.ui.git
/gitroot/pde/eclipse.pde.git
/gitroot/pde/eclipse.pde.build.git
/gitroot/pde/eclipse.pde.ui.git
/gitroot/pde/eclipse.pde.incubator.git
/gitroot/equinox/rt.equinox.bundles.git
/gitroot/equinox/rt.equinox.framework.git
/gitroot/equinox/rt.equinox.p2.git
/gitroot/equinox/rt.equinox.incubator.git
/gitroot/equinox/rt.equinox.security.git

Paul, John and I had some hallway conversations about this. Paul had some concerns about the one repo per unix group approach so I'll let him update this bug with his proposal.
(In reply to comment #22)
> Paul, John and I had some hallway conversations about this. Paul had some concerns about the one repo per unix group approach so I'll let him update this bug with his proposal.

Finer-grained repos make sense to me. We cannot, however, split a repo across UNIX groups.
I fully support encapsulating any given repo within one unix group :-)

In looking through eclipse.platform.ui, it seems we have more than one buildable unit.

1. jface, core commands, and databinding
2. workbench and ui (rest of RCP)
3. ide and ide support (application, etc, based on core.resources)
4. Eclipse 4 stuff, which depends on EMF

I can put them all into 1 repo, eclipse.platform.ui.git (my initial tests place the .git repo at about 85M), but it might make sense to put the projects in 3 repos:

1) jface+commands+databinding
2) workbench+ide
3) Eclipse 4

Managing this by unix group could lead to:

#Option 1
/gitroot/platform/eclipse.platform.ui/org.eclipse.jface.git
/gitroot/platform/eclipse.platform.ui/org.eclipse.ui.git
/gitroot/platform/eclipse.platform.ui/org.eclipse.eclipse4.git <- name still TBD

or a unix-group like container at the gitroot:

#Option 2
/gitroot/eclipse.platform.ui/org.eclipse.jface.git
/gitroot/eclipse.platform.ui/org.eclipse.ui.git
/gitroot/eclipse.platform.ui/org.eclipse.eclipse4.git <- name still TBD

This could still fit into the proposal in comment #22 as:

#Option 3
/gitroot/platform/org.eclipse.jface.git
/gitroot/platform/org.eclipse.ui.git
/gitroot/platform/org.eclipse.eclipse4.git <- name still TBD

The difference between Options 1 & 2 and Option 3 is that growing new repos in Options 1 & 2 can be done by the developers, similar to how we already manage other git repos (like /gitroot/e4). In Option 3, all 3 repos would have to be created by a webmaster so they have the correct unix group permission. If we need another repo, we'll have to submit a bug (I'm not saying it's a bad thing, but it is different from how we manage e4, for example).

PW
(In reply to comment #24)
> I can put them all into 1 repo, eclipse.platform.ui.git (my initial tests place the .git repo at about 85M), but it might make sense to put the projects in 3 repos:
>
> 1) jface+commands+databinding
> 2) workbench+ide
> 3) Eclipse 4

The advantage of more repositories is that someone who knows they only want to work on a subset can check out less stuff. On the other hand, my fairly short experience is that working with multiple git repositories can be a real pain. For example, if you have a change that spans multiple repositories, you have to do the branch/fetch/merge/commit/push dance for each repository separately. You don't have a single atomic commit that can be merged/cherry-picked across remotes in a single step, etc. In Orion we have two repositories, but after six months I kinda wished we had only created one because of all the extra workflow steps introduced by having two repositories. Maybe this is one of the reasons other big projects like the Linux kernel use a single Git repository.

Considering that someone working on Platform UI might also need to clone SWT, Runtime, Equinox, and possibly others, we should try to avoid increasing the number of clones unnecessarily. In the end it is the Platform UI committers that will feel this pain though, so we can do whatever you guys want. Maybe bring it up at your next Platform UI planning call?
(In reply to comment #25)
> On the other hand, my fairly short experience is that working with multiple git repositories can be a real pain.

Agreed. I've spent a couple of years with CDT cloned at the project level, and I ended up manipulating the clones using some crafted bash to run cgit over all of them at once. It's much easier with fewer repos...

The other thing to bear in mind is that if you decide at a later date that you want to split out some content from the main repository, this can be easily done in git. It's more painful to recombine a number of repos.

85M doesn't sound too bad -- this is the starting size of the CDT repo too.
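The kind of "run git over all of the clones at once" helper described above might look roughly like the sketch below. It is not the bash script mentioned in the comment; the function name `git_in_each` and the clone paths in the usage comment are hypothetical. It relies on git's `-C <path>` option to run a command inside each clone, and takes the runner as a parameter so it can be exercised without touching real repositories.

```python
import subprocess


def git_in_each(clones, git_args, run=subprocess.run):
    """Run `git <git_args>` inside every clone; return {clone: exit code}.

    `run` defaults to subprocess.run and is injectable for dry runs.
    """
    results = {}
    for clone in clones:
        # `git -C <dir>` makes git behave as if started in <dir>.
        completed = run(["git", "-C", str(clone), *git_args], check=False)
        results[str(clone)] = completed.returncode
    return results


# Example usage with hypothetical clone directories:
#   git_in_each(["~/git/eclipse.platform.ui", "~/git/eclipse.platform.swt"],
#               ["fetch", "origin"])
```

Even with such a helper, a change spanning several repositories still ends up as one commit per repository, which is the workflow cost raised in the preceding comments.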
(In reply to comment #25)
> Considering that someone working on Platform UI might also need to clone SWT, Runtime, Equinox, and possibly others, we should try to avoid increasing the number of clones unnecessarily.

OK, that's a fairly convincing argument as well :-)

I'll bring the discussion up at our Platform UI call, but now I'm leaning towards one eclipse.platform.ui.git repo.

PW
I opened bug 349891 to create the directories for the git repos for Eclipse and Equinox so we can do a full test build. We can sort out the platform ui repo after they discuss it in their planning call.
(In reply to comment #28)
> I opened bug 349891 to create the directories for the git repos for Eclipse and Equinox so we can do a full test build. We can sort out the platform ui repo after they discuss it in their planning call.

Platform UI will just follow the convention:

/gitroot/platform/eclipse.platform.ui.git

PW
I think this bug can be closed. We're making progress with the git migration with eight of 25 git repos transitioned so far.