RE: [higgins-dev] Research on how Higgins could convert from CVS to SVN

From: "Paul Trevithick" <paul@xxxxxxxxxxxxxxxxx>
Date: Tue, 20 Nov 2007 19:35:42 -0500

Sounds good Andy.

Any objections to Mary or me asking Matt Ward to do the conversion at his convenience starting sometime after 5pm Eastern?

Going once…

From: higgins-dev-bounces@xxxxxxxxxxx [mailto:higgins-dev-bounces@xxxxxxxxxxx] On Behalf Of Andrew Hodgkinson
Sent: Tuesday, November 20, 2007 10:58 AM
To: 'Higgins (Trust Framework) Project developer discussions'
Cc: Webmaster(Matt Ward)
Subject: RE: [higgins-dev] Research on how Higgins could convert from CVS to SVN

Paul,

I think this would be a good week to make the switch to SVN and I can perform the conversion again if Matt (Eclipse webmaster) will create a new snapshot of the CVS repository. Since we don't have filesystem access to the CVS repository, there is no way to do the conversion without involving an Eclipse webmaster.

In the e-mail exchanges with Matt, it sounded like he (or someone on his team) would do the conversion as soon as we gave him the go-ahead. It is actually more work for him to create the CVS snapshot, upload it, and then download and import the resulting SVN dump file than it is to run the script directly against the repository. Either way, I am willing to help make this happen.

Matt will need you or Mary to authorize the creation of a new dump file and/or the SVN conversion. If there are no objections to this plan, let's set the CVS check-in cut-off time to 5 PM EST on Wednesday. I've CC'd Matt on this e-mail (Matt -- what would work best for you?).

Thanks,

Andy

>>> "Paul Trevithick" <paul@xxxxxxxxxxxxxxxxx> 11/19/07 8:21 PM >>>

Here’s a suggestion. We all check in everything this Wednesday and then wait for Andy to perform the conversion and then tell us when it is once again safe to go into the water.

From: higgins-dev-bounces@xxxxxxxxxxx [mailto:higgins-dev-bounces@xxxxxxxxxxx] On Behalf Of Jim Sermersheim
Sent: Monday, November 19, 2007 7:37 PM
To: 'Higgins (Trust Framework) Project developer discussions'
Subject: Re: [higgins-dev] Research on how Higgins could convert from CVS to SVN

Does anyone have reservations about making the switch? If not, what are the next steps? We'll need to agree on a time that everyone stops using CVS for an hour or so.

>>> "Jim Sermersheim" <jimse@xxxxxxxxxx> 11/15/07 2:53 PM >>>

So, I tried importing from this repository into Eclipse (using Subclipse) and it works perfectly. File histories are preserved and version compares work. I pulled down IdAS and all dependency projects, ran my handy AWK script to link all the dependency libs to my common deps folder, and all the projects built without error.

I didn't have to enter the password twice from within Eclipse.

Jim

>>> "Andrew Hodgkinson" <ahodgkinson@xxxxxxxxxx> 11/15/07 1:36 PM >>>

All,

The trial conversion of the Higgins repository (296 MB) from CVS to SVN completed without any errors. Creating the SVN dump file (which can subsequently be imported into SVN using 'svnadmin load') took a little over 4 minutes. The cvs2svn statistics are somewhat interesting:

cvs2svn Statistics:
------------------
Total CVS Files:          9880
Total CVS Revisions:      25451
Total CVS Branches:       5362
Total CVS Tags:           33151
Total Unique Tags:        27
Total Unique Branches:    6
CVS Repos Size in KB:     239215
Total SVN Commits:        2245
First Revision Date:      Wed Oct 12 13:21:57 2005
Last Revision Date:       Thu Nov 15 08:39:48 2007
------------------

Timings (seconds):
------------------
 80.9   pass1    CollectRevsPass
  0.0   pass2    CollateSymbolsPass
 14.6   pass3    FilterSymbolsPass
  0.1   pass4    SortRevisionSummaryPass
  0.6   pass5    SortSymbolSummaryPass
 14.3   pass6    InitializeChangesetsPass
  6.6   pass7    BreakRevisionChangesetCyclesPass
  6.6   pass8    RevisionTopologicalSortPass
  3.5   pass9    BreakSymbolChangesetCyclesPass
  7.1   pass10   BreakAllChangesetCyclesPass
 11.2   pass11   TopologicalSortPass
  6.6   pass12   CreateRevsPass
  0.2   pass13   SortSymbolsPass
  0.2   pass14   IndexSymbolsPass
 99.3   pass15   OutputPass
252.1   total

real    4m12.295s
user    2m59.771s
sys     0m11.901s
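
As a quick sanity check on the timings above, the per-pass figures can be summed and ranked. This is only a sketch in Python: the numbers are copied from the report, while the variable and function names are illustrative assumptions and not part of cvs2svn.

```python
# Per-pass timings in seconds, copied from the cvs2svn report above.
pass_timings = {
    "CollectRevsPass": 80.9,
    "CollateSymbolsPass": 0.0,
    "FilterSymbolsPass": 14.6,
    "SortRevisionSummaryPass": 0.1,
    "SortSymbolSummaryPass": 0.6,
    "InitializeChangesetsPass": 14.3,
    "BreakRevisionChangesetCyclesPass": 6.6,
    "RevisionTopologicalSortPass": 6.6,
    "BreakSymbolChangesetCyclesPass": 3.5,
    "BreakAllChangesetCyclesPass": 7.1,
    "TopologicalSortPass": 11.2,
    "CreateRevsPass": 6.6,
    "SortSymbolsPass": 0.2,
    "IndexSymbolsPass": 0.2,
    "OutputPass": 99.3,
}
reported_total = 252.1  # the "total" line of the report


def slowest_passes(timings, count=2):
    """Return the `count` slowest passes as (name, seconds), slowest first."""
    ranked = sorted(timings.items(), key=lambda item: item[1], reverse=True)
    return ranked[:count]


summed = round(sum(pass_timings.values()), 1)
top = slowest_passes(pass_timings)
share = sum(seconds for _, seconds in top) / reported_total

print(f"sum of passes: {summed} s (reported total: {reported_total} s)")
for name, seconds in top:
    print(f"{name}: {seconds} s")
print(f"two slowest passes: {share:.0%} of the reported total")
```

The per-pass figures sum to 251.8 s against the reported 252.1 s; the small gap is presumably rounding or time spent outside the passes. OutputPass and CollectRevsPass together account for roughly 71% of the run.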

Importing the dump file to create the SVN repository required approximately 21 minutes on my server. If you would like to check out a working copy to verify that your component was migrated correctly, I've put a copy of the repository on cards.bandit-project.org. Assuming SVN is installed on your client, you can check out a copy using the following command:

svn checkout svn+ssh://higgins@xxxxxxxxxxxxxxxxxxxxxxxx/home/higgins/svn/org.eclipse.higgins/trunk destdir

The password is higgins$test. You can also pull down a copy that includes all of the branches by using the following command:

svn checkout svn+ssh://higgins@xxxxxxxxxxxxxxxxxxxxxxxx/home/higgins/svn/org.eclipse.higgins destdir

Note that you will be required to enter the password twice. This is a quirk of using the svn+ssh scheme.

Thanks,

Andy

>>> "Daniel Sanders" <dsanders@xxxxxxxxxx> 11/14/07 1:50 PM >>>

I don't think the single versioning thing is a big issue. It means that for a given sub-directory tree (which might represent a project), the version history might include changes at revision numbers 2, 3, 5, 8, 10, 16, and 30, but not at any of the numbers in between. As long as you can see which revision numbers actually apply to a particular sub-directory tree (and you can), I don't know why that list necessarily has to be contiguous. It is also easy to find the highest revision number at which a change actually occurred in a sub-directory tree. Are there other specific concerns you have about this, or limitations you perceive?

BTW, any given file in an SVN repository will also have a sparsely populated version history, so even with one SVN repository per project, the version history for any object in the repository will still be sparse. If we accept sparse version histories for single files, why would it matter that the same is true for sub-directories that represent a particular project?
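
To make the sparse-history point concrete, here is a small Python model of a repository with one global revision counter. It is a toy illustration, not Subversion code; the commit list and the project directory names (idas, sts, rpps) are hypothetical.

```python
from collections import defaultdict


def revisions_by_project(commits):
    """Map each top-level directory to the repository revisions that touched it."""
    history = defaultdict(list)
    for revision, paths in enumerate(commits, start=1):
        # One commit gets one repository-wide revision number,
        # no matter how many projects it touches.
        for project in sorted({path.split("/")[0] for path in paths}):
            history[project].append(revision)
    return dict(history)


# Hypothetical commit log: each entry lists the paths changed by one commit.
commits = [
    ["idas/src/A.java"],
    ["sts/src/B.java"],
    ["idas/src/A.java", "sts/src/B.java"],
    ["rpps/src/C.java"],
    ["idas/src/D.java"],
]

history = revisions_by_project(commits)
for project, revisions in history.items():
    print(project, revisions, "latest:", revisions[-1])
```

Each project ends up with an increasing but non-contiguous list of revisions, and the last entry is the highest revision at which that project changed. In a real working copy, running 'svn log' on a directory similarly lists only the revisions that affected it.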

>>> "Jim Sermersheim" <jimse@xxxxxxxxxx> 11/14/2007 12:34 PM >>>

With directory versioning or versioned metadata, is it possible to have a repository containing a number of different "projects" (each in its own directory) where each project (subdirectory tree) has its own versioning? All the repositories I've used have a single version for the repository.

>>> "Andrew Hodgkinson" <ahodgkinson@xxxxxxxxxx> 11/14/07 11:18 AM >>>

I'm in the process of setting up a local Subversion repository so that I can attempt a trial run of the CVS conversion script against the Higgins repository. I'll report the results on tomorrow's conference call. In the meantime, for those of you who aren't familiar with Subversion, I found the following section from "Version Control with Subversion" very informative:

When discussing the features that Subversion brings to the version control table, it is often helpful to speak of them in terms of how they improve upon CVS's design. Subversion provides:

Directory versioning

CVS only tracks the history of individual files, but Subversion implements a "virtual" versioned filesystem that tracks changes to whole directory trees over time. Files and directories are versioned.

True version history

Since CVS is limited to file versioning, operations such as copies and renames -- which might happen to files, but which are really changes to the contents of some containing directory -- aren't supported in CVS. Additionally, in CVS you cannot replace a versioned file with some new thing of the same name without the new item inheriting the history of the old -- perhaps completely unrelated -- file. With Subversion, you can add, delete, copy, and rename both files and directories. And every newly added file begins with a fresh, clean history all its own.

Atomic commits

A collection of modifications either goes into the repository completely, or not at all. This allows developers to construct and commit changes as logical chunks, and prevents problems that can occur when only a portion of a set of changes is successfully sent to the repository.

Versioned metadata

Each file and directory has a set of properties -- keys and their values -- associated with it. You can create and store any arbitrary key/value pairs you wish. Properties are versioned over time, just like file contents.

Choice of network layers

Subversion has an abstracted notion of repository access, making it easy for people to implement new network mechanisms. Subversion can plug into the Apache HTTP Server as an extension module. This gives Subversion a big advantage in stability and interoperability, and instant access to existing features provided by that server -- authentication, authorization, wire compression, and so on. A more lightweight, standalone Subversion server process is also available. This server speaks a custom protocol which can be easily tunneled over SSH.

Consistent data handling

Subversion expresses file differences using a binary differencing algorithm, which works identically on both text (human-readable) and binary (human-unreadable) files. Both types of files are stored equally compressed in the repository, and differences are transmitted in both directions across the network.

Efficient branching and tagging

The cost of branching and tagging need not be proportional to the project size. Subversion creates branches and tags by simply copying the project, using a mechanism similar to a hard-link. Thus these operations take only a very small, constant amount of time.

Hackability

Subversion has no historical baggage; it is implemented as a collection of shared C libraries with well-defined APIs. This makes Subversion extremely maintainable and usable by other applications and languages.

>>> "Mary Ruddy" <mary@xxxxxxxxxxxxxxxxx> 11/09/07 9:20 AM >>>

When the Higgins project was started, Eclipse only offered CVS, not SVN. So even though SVN has advantages, we had to use CVS. SVN is now available to projects on request.

SVN has some features that will give us more control over the build process.

For example, Andy "used to use CVS on another project and moved to SVN. Originally his project was hesitant, but they found that it made doing nightly builds much easier, as a nightly build can be kicked off on a particular revision. Didn't need to worry about tagging, while letting developers check in ahead of the builds. Also can get atomic commits (all or nothing). Also able to use SVN revision in the file name of a resulting build so that if someone subsequently reported a bug, we could go back to the exact source for the build".
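
The idea of putting the SVN revision into the build file name could look roughly like the Python sketch below. The naming scheme, the function names, and the "higgins-nightly" prefix are assumptions for illustration; the thread does not say how Andy's project actually named its builds, and revision 2245 is borrowed from the trial conversion's commit count only as sample input.

```python
import re


def build_file_name(project, svn_revision, extension="zip"):
    """Return an artifact name that embeds the SVN revision it was built from."""
    if not isinstance(svn_revision, int) or svn_revision < 1:
        raise ValueError("svn_revision must be a positive integer")
    return f"{project}-r{svn_revision}.{extension}"


def revision_from_file_name(file_name):
    """Recover the revision from a name made by build_file_name, or None."""
    match = re.search(r"-r(\d+)\.[\w.]+$", file_name)
    return int(match.group(1)) if match else None


name = build_file_name("higgins-nightly", 2245)
print(name)
print(revision_from_file_name(name))
```

Given a bug report against such a file, the revision recovered from the name can be passed to 'svn checkout -r' to retrieve the exact source the build came from, with no tag required.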

On the Higgins developers call yesterday, we discussed the pros and cons. Dev notes to follow. We had a guest speaker on the dev call from the Financial Services Technology Consortium and we agreed to let him review the notes on his presentation to ensure accuracy, so the notes are delayed.

During the call, Andy was nominated to research preparations for doing a dry run of the conversion as part of formally preparing for any actual conversion. More to follow.

Below is the overview information I got on the process from Matt Ward, one of the Eclipse webmasters:

Basically a project needs to decide on its developers list, and then the PL sends in a request to have the repository moved from cvs to svn.

We use the cvs2svn tool, but there are a couple of caveats from the documentation:

1) CVS doesn't record complete information about your project's history. For example, CVS doesn't record what file modifications took place within the same CVS commit. Therefore, cvs2svn attempts to infer from CVS's incomplete information what /really/ happened in the history of your repository. So the second goal of cvs2svn is to reconstruct as much of your CVS repository's history as possible.

2) One of the most important topics to consider when converting a repository is the distinction between binary and text files. If you accidentally treat a binary file as text *your repository contents will be corrupted*.

For more details check out http://cvs2svn.tigris.org/cvs2svn.html

-Matt.
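
Regarding the binary/text caveat, one way to spot-check a checkout before or after conversion is a simple content heuristic, sketched below in Python. This is a generic heuristic and an assumption on my part, not the mechanism cvs2svn itself uses (see the cvs2svn documentation for that); the function name and sample size are illustrative.

```python
import codecs


def looks_binary(data, sample_size=8192):
    """Heuristic: content is binary if its first bytes contain a NUL byte
    or are not valid UTF-8."""
    sample = data[:sample_size]
    if b"\x00" in sample:
        return True
    # An incremental decoder tolerates a multi-byte character that was
    # cut in half at the end of the sample.
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        decoder.decode(sample, final=False)
    except UnicodeDecodeError:
        return True
    return False


print(looks_binary(b"public class A {}\n"))
print(looks_binary(b"\x89PNG\r\n\x1a\n"))
```

The check is deliberately conservative: text in a legacy 8-bit encoding may be reported as binary, which is the safer mistake, since the corruption risk described above comes from treating a binary file as text.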
