Re: [eclipse.org-committers] Installing SVN

Hi Karl,
Here are some views to balance things up a bit :-). They are mostly from a user's perspective, but not all.


Karl Matthias wrote:
Using a monolithic database system rather than the filesystem (you know, the thing everyone else uses to store files) means we can't leverage the granular controls we already have in place (Unix groups) to control access to repositories.
You win some and you lose some. SVN has atomic commits. It might seem like a trivial feature to those who rarely experience network problems, but for those of us who do on a somewhat regular basis, it's very valuable. SVN can also revision directories as well as files. That is very hard to do if you're using a file system. SVN doesn't lose track of history just because you move or rename a file. All of those features are attributable to the fact that SVN uses a database and hence isn't subject to all the limitations you have in a file system. Revisioned data is, after all, not files. It is fragmented pieces of information. The CVS model is IMHO severely limited.

I know, none of that has any bearing on server administration. It does have relevance to a lot of users though.

We could use some SVN tooling to implement somewhat tighter controls, but it would be a complete duplication of a system we already maintain.
My guess is that you have invested a lot of time over the years in this tooling, but since CVS has always been the foundation, the tooling doesn't have the abstractions needed to slot in something else. Now SVN comes in. It does things differently, so of course the effort needs to be duplicated, and that's hard work. But SVN is not to blame for that.

This is the smaller of the two problems. The larger one is that when something becomes corrupted (happens too often, unfortunately), you lose the whole repository. Then we have to restore the whole repository either from a nightly mirror (hopefully), or from tape, which can be very slow. Committers are then forced to re-commit any changes that happened that day. That's harder than you would think because the client thinks it's in sync already, which means you have to check the code out from the restored repository, manually copy the files over and re-commit. The repository is down during this whole process.
Really? I've been using SVN at Eclipse since the beginning. I've never experienced a crash. How often does that happen, and why have I never been affected by it? Are you that good at fixing and hiding things? No sarcasm intended here, mind you. I really do think you do a great job maintaining the repos.

What backend are you using? The Berkeley DB based or the file-based (FSFS) repo?

With CVS you can restore files on an individual basis and one corrupted file (almost never happens) rarely causes a problem with anything else. Restores take seconds because you don't have to load megs of data from tape. Copying the repositories is slow because the fact that the files are very large makes rsync less helpful since it diffs whole files. It wastes a lot of time and local bandwidth.

I would attribute that partly to how the repositories are organized at Eclipse.org. Why not let each (leaf) project have its own SVN repository? That way you'd get much smaller units to manage.

And, with SVN, old revisions are always kept. This might be good in some ways, but mostly it's problematic. The effect is that when someone checks in something that violates IP cleanliness rules or that was just plain wrong, we have to dump the whole repository, filter the text file with the SVN tools, and reload it. This is slow and error prone. The repository is unusable while we do that, and something fairly often breaks in the tooling while doing this (ask the Technology people), with the not unlikely possibility of toasting the repository and having to try again.
If this happens frequently, it can be a problem of course, especially if the repositories are very large. This too is a valid argument for a more fine-grained organization of the repositories. Fewer people are affected when you do it, and it finishes a lot quicker.


A third and more minor, but still very annoying issue, is that SVN does not log LOC counts for commits. That means we can't track that for project Dash without doing a diff of every change and counting the lines of code. That's an absurd use of resources, so we don't do it.
I doubt you need to do that on every commit. The way I've understood it, you can do this on some regular basis (weekly perhaps?). If it's run on the same machine as the server, it shouldn't be that resource demanding.
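
To illustrate how cheap the counting step itself is, here is a minimal sketch in Python. It assumes the unified diff text for a revision has already been obtained somehow (for example from "svn diff -c REV"), and the function name count_changed_lines is just something I made up for the example. It is not how Dash does or would do it.

```python
def count_changed_lines(diff_text):
    """Return (added, removed) line counts for a unified diff.

    Hypothetical helper for illustration only. Lines are counted only
    inside hunks (after an "@@" header), so the "---"/"+++" file
    headers are not mistaken for changes.
    """
    added = removed = 0
    in_hunk = False
    for line in diff_text.splitlines():
        if line.startswith("@@"):
            # Hunk header: the lines that follow are context or changes.
            in_hunk = True
        elif line.startswith(("Index: ", "diff ", "Property changes on: ")):
            # Start of the next file or a property section: leave the hunk.
            in_hunk = False
        elif in_hunk:
            if line.startswith("+"):
                added += 1
            elif line.startswith("-"):
                removed += 1
    return added, removed


# Small made-up diff to show the idea.
sample_diff = "\n".join([
    "Index: Foo.java",
    "===================================================================",
    "--- Foo.java\t(revision 1)",
    "+++ Foo.java\t(revision 2)",
    "@@ -1,3 +1,4 @@",
    " class Foo {",
    "-    int x;",
    "+    int x = 0;",
    "+    int y = 0;",
    " }",
])

added, removed = count_changed_lines(sample_diff)
print(added, removed)
```

Producing the diffs is where the real cost lies, which is why running the job in batches on the server machine seems reasonable to me.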


Now, I don't do large branches and merges, so I can't speak to that from a user perspective, but in case you want to know how it feels to admin SVN, there you have it. :) Thanks for listening and cheers


I work frequently with branches. I never managed to make friends with the CVS way of doing that. SVN's model is simple, intuitive, and very resource efficient.

I frequently refactor my Java packages, move files around, and rename them. CVS just cannot handle that at all. It hence becomes a limiting factor for code refactoring, something that has hit me in several projects in the past.

I am, from time to time, forced to work on CVS code bases. Luckily it's often very mature code where package or class renaming is very uncommon. God forbid that I would be forced to use CVS on newer projects. That would be a major step back for me. That said, I must admit that the support for CVS in the Eclipse IDE is awesome. It's actually a lot better than the limited and outdated back-end deserves.

The one and only problem that I have with SVN right now is that Eclipse.org is refusing to redistribute the client due to IP issues. The client is LGPL'ed but the lawyers apparently complain about that too since it can be reverse engineered. That's at a level that I cannot comprehend. Why would you reverse engineer a binary when the source is free? But that's another discussion for another forum.

Regards,
Thomas Hallgren