Forum OpenACS Improvement Proposals (TIPs): TIP #25 (Rejected): No direct manipulation of the CVS repository

As raised here, somebody seems to have broken the CVS repository by manually shifting a bunch of packages to contrib/obsolete-packages.

I propose that no direct manipulation of the CVS repository occur without explicit approval in a TIP.

Russell,
there was no break-in, and yes, there was a TIP: https://openacs.org/forums/message-view?message_id=120868.

Sorry about the inconvenience caused! I didn't know of a better way to move the packages than to move the dirs in the repository. If you could outline here how it's most appropriately done we'll do a better job next time.

Thanks!

See http://www.gnu.org/manual/cvs/html_chapter/cvs_15.html
and http://www.gnu.org/manual/cvs/html_chapter/cvs_14.html
for how things are usually done.

The tricky issue is that the modules file also dictates
what is in core and that is not really tied to a particular
version so you still have issues trying to check out an
older release from cvs.

Peter - I didn't say that the repository had been broken into, I said that it has been broken. As a result of manually shifting directories around inside the cvsroot the revision history no longer works, such that tags/branches made over a fortnight or so ago no longer refer to the same source tree they did when they were tagged. I'm not talking about a TIP for the decision to move those packages to obsolete, I'm talking about a TIP for messing with the repository structure directly.

One possible way of making that change that would both preserve the revision history of those packages and prevent the current brokenness would be to duplicate rather than move the repository directories in question, and then in a correctly checked out working copy "cvs rm" them from HEAD. This would leave entries for (for example) acs-workflow in the main packages directory so that the tags for 4.x releases could still refer to the actual source trees that were released (strictly speaking you'd get a copy of the released sources plus duplicate copies of the packages in question in the obsolete directory, but I'd rather have a second copy sitting around that I won't use, than have no copy of a package I depend on).
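To make the difference concrete, here is a minimal sketch in Python of why moving the repository files breaks old tags while copy-then-remove does not. It is a toy model under stated assumptions, not real CVS: each repository path is reduced to a record holding its tags and whether it is dead on HEAD, and a checkout simply selects paths by tag. The tag name `oacs-4-6`, the `acs-kernel` entry, and all function names are made up for illustration.

```python
# Toy model only: CVS keeps one history file per path, and tags live inside
# that file. Nothing here talks to a real CVS server.
from copy import deepcopy

OLD = "packages/acs-workflow"
NEW = "contrib/obsolete-packages/acs-workflow"


def make_repo():
    """Build a tiny repository: path -> history record (hypothetical contents)."""
    return {
        OLD: {"tags": {"oacs-4-6"}, "dead_on_head": False},
        "packages/acs-kernel": {"tags": {"oacs-4-6"}, "dead_on_head": False},
    }


def checkout(repo, tag=None):
    """Return the sorted paths a checkout yields for a tag, or for HEAD if tag is None."""
    if tag is None:
        return sorted(p for p, h in repo.items() if not h["dead_on_head"])
    return sorted(p for p, h in repo.items() if tag in h["tags"])


def move_in_repository(repo, old, new):
    """Direct manipulation: the history record, tags included, leaves the old path."""
    repo[new] = repo.pop(old)


def copy_then_remove(repo, old, new):
    """Duplicate the history record, then mark the old path removed on HEAD only."""
    repo[new] = deepcopy(repo[old])
    repo[old]["dead_on_head"] = True


moved = make_repo()
move_in_repository(moved, OLD, NEW)

copied = make_repo()
copy_then_remove(copied, OLD, NEW)

print(checkout(moved, "oacs-4-6"))   # old tag after a raw move
print(checkout(copied, "oacs-4-6"))  # old tag after copy + remove
print(checkout(copied))              # HEAD after copy + remove
```

In this model the raw move makes the old tag lose `packages/acs-workflow`, which is the breakage described above. With copy-then-remove the tag still yields the package at its original path, plus the duplicate under the obsolete directory, and HEAD shows it only in the new location.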

Russell's "one possible way" above basically describes the preferred method to use for this sort of CVS repository surgery. (Well, my preferred method anyway; I think most people's.) I am not aware of any good reason to break version history by simply moving repository files from one place to another, rather than by first copying them and then cvs removing them from their old locations in the normal fashion.

And to be blunt, anyone directly screwing with the CVS repository should know these things.

Fortunately, it sounds like nothing irreversible was done! If people want it to be fixed, it can be fixed: the repository files can simply be copied (not moved) back to their old location, and cvs rm'd there.

Now, the modules file, that does sound trickier, and I've never used one myself so I don't have much useful to say. You really can't version it and tag it, or something like that? That is yet another serious CVS mis-feature then.

Also, "No direct manipulation of CVS repository without a TIP" seems sort of silly and overly bureaucratic. However much we wish it wasn't so, there are things CVS just plain doesn't let you do any other way - there can be good reasons for a maintainer to directly manipulate files in the repository. Not regularly, probably never even often, but I've done it myself often enough in the past, at work and elsewhere.

Okay, now please go ahead and kill me, but why are we sticking with CVS? Wouldn't it make sense in light of recent events to have a look at e.g. Subversion (I'd have said Perforce, but they seem to have changed their licensing policy)? Just food for thought, and I'm absolutely aware of the fact that Subversion is still not in the 1.0 release state.
I had this exact conversation with Mark Aufflick yesterday - I don't think the "pre-beta" status of svn is really an issue for stability and repository safety (svn FAQ, some deployed sites). The problem is that they don't issue binaries and they're not guaranteeing client/server interoperability over more than a (3-week or so) release cycle, so everyone who currently works on OACS through CVS rather than release tarballs would need to start building svn client software and keeping it in sync with what is being run on the OACS repository. It's doable, and probably not that much of a problem for the companies and individuals who are the core of OACS development, but it places a burden on newcomers and casual developers that could well drive them off.

which sucks, because CVS blows goats and subversion fixes the bits that suck while keeping the familiar CVS UI...

If there's any interest in following this sort of path I'll find out what the actual (rather than "supported") level of compatibility is between cross-version clients and servers...

I think we should wait until it hits 1.0.  Don't we already have enough prerequisite programs that haven't hit 1.0 yet?
Posted by John Sequeira on
It's not my intent to pitch a particular CVS alternative,  but folks interested in the topic will find this fairly recent write-up interesting:

http://seppuku.editthispage.com/2003/07/30

Malte,

I've brought up this issue with the other OCT members. I am preparing documents, a migration path, and a TIP for OpenACS to switch to using the GNU Arch SCM to replace CVS.

It is simple, has very advanced features for distributed repositories and merging, makes it much easier to maintain an OpenACS repository with your changes while still being able to synch with the main repository, etc.

Talli and Ben Bytheway are the ones who introduced me to it. I've been using it and it is great.

-Roberto

In addition, GNU Arch HAS hit the 1.0 milestone.  Version 1.1, which boasts a host of new features, is due out at the beginning of November.
If it works as advertised, Arch should be significantly better for OpenACS use than SubVersion, due to Arch's direct support for distributed development (rather like BitKeeper's).

This February 2003 Linux Kernel list thread on version control had some interesting info. From that, it seems likely that Arch is still, at least currently, noticeably inferior to BitKeeper.

E.g., Olivier Galibert said, "Hell, arch is still at the update-before-commit level. I'd have hoped PRCS would have cured that particular sickness in SCM design ages ago." which sounds interesting, although I don't know what that means exactly. (But Tom Lord thinks it mostly meaningless drivel, and explains why.) And Linus Torvalds mentioned, "Give it up. BitKeeper is simply superior to CVS/SVN, and will stay that way indefinitely since most people don't seem to even understand _why_ it is superior."

It is not clear to me whether Arch's (assumed) inferiority to BitKeeper is fundamental, or Simply a Matter of Programming. Lacking any additional information, I would tentatively conclude the latter - Arch can catch up with (eventually even surpass?) BitKeeper - but that doing so is probably hard.

Even so, that doesn't matter to OpenACS, as we've been getting along ok with CVS, which is much, much more inferior to BitKeeper than Arch could be! So if Arch is judged to be useful, currently a substantial improvement over CVS, well-designed, likely to undergo regular further improvement and maintenance, and a better choice for OpenACS than other competing Open Source revision control tools, then OpenACS should use it.

Note that above I said competing Open Source tools. What about closed source tools, like BitKeeper? The Linux kernel project uses BitKeeper. The Linux kernel is an extreme case of needing the best distributed version control system you can get, and needing it right away. Linus obviously decided that this need was so compelling and immediate that it overrode any concerns over BitKeeper's non-open-source license. OpenACS is not in that situation; we have no such extreme and overriding needs. Therefore it should be easy and best to simply side-step all the huge acrimonious licensing flamewars that the kernel folks have gone through, by simply going with the best Open Source tool, and looking at closed-source tools primarily only in order to better understand and evaluate the open source ones.

Perforce? It's closed source of course, and I've never heard any arguments on why it's better than BitKeeper. My guess is it isn't, at all. If we were going to go with a closed source tool (bad idea), I think it'd be hard to come up with any good reasons to use anything other than what the Linux kernel folks already went with - BitKeeper.

Tom Lord (original author of Arch) made some fairly insightful criticisms of SubVersion back in February 2003. Basically, he thinks the SubVersion project is fundamentally a failure, and is interested in understanding why and how.

Roberto, I would be very interested in seeing your stuff on OpenACS switching to Gnu Arch!

Oh, another important reason to avoid any non-Open-Source revision control (SCM) tool: (Important because it affects me :)

Whatever SCM tool the OpenACS project uses should also be easily usable by everyone who uses OpenACS, and should be something that can be depended upon to stay that way indefinitely, even if those OpenACS users are doing their own closed source work! This requirement turns out to basically mean that only an Open Source tool will do.

I've written OpenACS packages which are used only within my own company, and which we generally don't intend to ever release at all, never mind release as Open Source. It would be really frustrating if due to licensing concerns, for my own private OpenACS work, I had to use a different SCM than what OpenACS uses! Probably I wouldn't do that, I'd just shell out the bucks for a license of whatever closed-source SCM tool I needed - but that's a decision we probably don't want to force upon OpenACS users.

Andrew, thanks for the update on revision control systems! I just navigated to the BitKeeper website and found their product comparison section very enlightening. Having lived with CVS for so long I never realized so much sophistication could be built into revision control systems!
I'll be honest and say that I don't fully understand the DAG-based ultra-cool feature of BitK33per (using the 3's to avoid trademark issues), but I get the feeling.

Here's what I'm interested in for OpenACS:

- Making it easier for developers to work on their own tree of OpenACS and a) still be able to synchronize with the main tree, b) allow the OCT to grab their changes.

- No more merging nightmares and branch confusion.

- Keeping track of files regardless of naming in different trees.

arch gives us all of those. From my research, it'll make our lives much easier, and will greatly benefit those using OpenACS and wanting to have their own separate tree (i.e. a customized installation) while still being able to sync with the main tree.

Now I'm sure BitK33per would give us all of those and perhaps more. Their (Tcl/Tk-based!) graphical tools, for one, are very impressive. Linus can afford to mandate a proprietary product that he helped design. Perhaps we could do that as well, but I'd rather not delve into that possibility.

We could use BitK33per, but that would mean that no one working on OpenACS would be able to contribute to other SCMs.

And honestly, I think arch will solve our problems just fine.

-Roberto

arch looks pretty impressive. I've made it part way through the tutorial and it seems to work as advertised. The commands are more understandable, as is the branch naming being related to the directory structure of the repository.

One deficiency, which is probably just a configuration detail, is setting up a networked repository. The only safe method is sftp. Possibly ftp can be redirected over ssh?

One very nice feature is the use of a log file, instead of a log message. The tutorial suggests creating the log file before you start work and then adding to the file as you make changes. Once you commit, the log file is removed.

The sftp that arch supports is in the stock openssh package.  It doesn't require anything else to run.
A word of warning about sftp--it does not log transfers at all, unlike a standard ftp daemon.  We may not care.

I googled for this deficiency and it looks like the sftp developers are addressing it, but it may not appear in the ssh release soon.

I don't believe scp logs transfers either, right?  I don't see
logging as being of much value one way or the other.
Arch also can be accessed via WebDAV, which is nice.
OK, I just spent part of the afternoon reviewing Gnu Arch.  All I can say is "whoa".  Talk about having your paradigm shifted.  Three things really concern me about this:

1. There are very, very few tools for manipulating an arch repository.  This is mostly because the project is new.

2. Arch cannot be ported to Windows (via Cygwin, quickest way right now) because Windows has problems with pathnames over 255 characters.  There are ways around this, but they involve a few too many people (Cygwin needs to make some changes and Microsoft would need to make some changes), and so a Windows version of arch is not likely to exist anytime soon.

3. We're going to need a ton of very clear documentation on how to make arch work for us.  Unlike cvs, there isn't really a set of best practices since it is so new.

I could ramble more about this, but this seems a good place to stop.

What do you mean by "manipulating an arch repository"? The repository should not be directly manipulated as we often are forced to do with CVS. It can be manipulated in many ways with the "tla" command, just like the "cvs" command.

Regarding (2): How many of our _developers_ committing code to the OpenACS tree rely exclusively on Windows, without access to a *nix box of some kind? My experience with the community says that it's a very very limited number. Do we even have anyone running OpenACS on Windows?

Sure it would be great if it worked everywhere, but we have to consider if it's worth it to continue to inflict pain on the entire development community because of a very restricted set of developers that we're not even sure exists.

Of course we'd continue to provide things like nightly tar balls, bug-tracker where people can submit patches, etc. Using arch would definitely _not_ isolate users who rely exclusively on Windows.

Re: 3) Very much agreed, and I'm working on it. As far as "best practices" for CVS, what I usually see is a hodge-podge of hacks to work around its deficiencies. But if you're working on a system that doesn't have those deficiencies, many of the "best practices" are not necessary anymore.

-Roberto

What do you mean by "manipulating an arch repository"?

For example, for CVS I can interact with a CVS repo with any of the following tools:

  • WinCVS (CVS Gui for windows)
  • TortoiseCVS (Windows Explorer shell extension)
  • pcl-cvs in Emacs
  • CVS plugin for Eclipse
  • ViewCVS web interface

Just to name a few. That way my less command-line savvy users have the warm fuzzies of a GUI. And WinCVS in particular has a really cool graphical branch viewer that makes it easy to see what happened in the past. Those kinds of things. "Third Party Tool Support" for lack of a better term is what I was thinking.

Emacs integration in particular is something we would really miss.

Roberto, enough people are on windows or are potentially on windows that this is a major problem. If they are, it's reasonable to recommend they use Cygwin, but if that part is busted, then they are massively screwed. The only resolution would be for them to use a CVS repo which is synched with Arch, but that is far less than optimal.

I was the first advocate of moving to GNU Arch, and I even host gnuarch.org (along with Mat Kovach). But I still think Arch is 6 months to a year away from being ready for the OpenACS community to embrace it.

CVS may suck, but it's a least common denominator that most people understand and can use. If individual developers are willing to use Arch and synch that with a canonical CVS repo, that would be the fastest way to get moving with this stuff.

talli

Regarding the actual proposal.

I think that a strict rule never to touch the repository is not necessary. We can perhaps have a document that includes reminders of how to move packages etc in the openacs repository.

I'm going to go ahead and VETO.  I do understand there are problems and that they are important to address but I am not in favor of trying to solve this problem by creating more TIPs.

I do want to thank Russell for bringing up this important issue.

27: Gnu Arch vs. Monotone (response to 1)
Posted by Andrew Piskorski on
A possible alternative to Gnu Arch might be monotone. I'd never heard of it before and don't really understand how or why it is better/worse than Arch or BitKeeper, but it claims to be a distributed version control system (like Arch and BitKeeper), and uses SQLite internally, which may be a good sign.
Posted by Andrew Piskorski on
From John Sequeira's weblog, I just noticed that Martin Pool (developer of rsync and distcc, etc.) has extensive thoughtful commentary on version control systems.
see sftplogging.sourceforge.net
Posted by Andrew Piskorski on
Also Darcs, which seems to have some possible advantages (simplicity, its "theory of patches" based merges, MS Windows support) over Arch. My gosh, first there was Bitkeeper, now also Arch, Darcs, Monotone, and also SVK and something called Codeville, which I just stumbled across. All of these are attempting to do distributed version control, but the, ah, wealth of different tools here makes picking one a bit complicated.
Also baz, which is a branch of Arch that has a lot of support. Its focus is on usability.