
Version Control System Shootout Redux Redux

Late last year, as the Mozilla project began looking at the tools we'd need for the Mozilla 2 development effort, it became clear that trusty ol' CVS, while carrying the torch for so long, would not meet our requirements for ground-breaking development.


Mortal Kombat II, Version Control Edition: The Prologue

At the last Mozilla Summit, we all met in a session and were able to narrow down the choices to two contenders for consideration: Mercurial and Bazaar.


Brendan's merge requirements bring all the version control systems to the yard, and they're like "It's better than yours..."

We were able to narrow down the decision quickly because the type of development Mozilla 2 will require dictates a model that differs from CVS's, so systems that attempt to emulate that working model, like SVN, would not work well for us. We turned our attention to the big names in open source distributed systems: Git, Mercurial, Bazaar, and Monotone.

While its developers have made recent progress, Git was lacking in Win32 support. It was unclear that this would ever change and, if it did, whether Git-on-Win32 would ever become more than a second-class citizen. As good, performant Win32 (and Mac and Linux) support is a hard requirement, Git lost in the early Kombat rounds. This is unfortunate because (as we would soon find out) lots of things that were issues with the other systems did "just work" in Git.

Monotone was ruled out relatively early as well, due to similar Win32 performance issues and not wanting to split developer resources between Mozilla work and Monotone fix- and feature-requests.

At first, this left Mercurial standing in the ring. Ahh... at last, a simple Mozilla project decision.

But then, during the version control discussion at the Summit, Bazaar was brought up. It had a decent set of features that sounded interesting and useful to us.

We started investigating. And suddenly, there were two.

[Insert a quarter to continue...]


If that didn't cause some repository corruption, I don't know what would...

Mercurial had been a strong contender from the beginning: a distributed version control system that was fast on all platforms, that by initial accounts handled importing the Mozilla CVS repository, and that came with useful developer extensions, like mqueue.

But as Benjamin and I started to play with it more, we were unable to replicate an import of CVS. We both hit a number of hurdles, and it became a game of whack-a-mole. First it was encoding problems with commit messages. We'd hack around that (with a hidden environment variable setting), and then weird contortions in the CVS repository would cause breakage. "Ok," we thought; "those are branches we probably care less about; let's ignore them." Then the parser barfed on commit messages that contained text that happened to look like cvsps output!

Mercurial was fast enough, once you had hopped over (and sometimes through) enough hurdles to get it to complete the requested operation.


It looked like handling larger repos wouldn't be a problem...

In parallel, we started looking thoroughly at Bazaar. As with Mercurial, we started the evaluation with an import attempt.

After working through some initial issues with the Bazaar developers, who were extremely helpful, we had an import going within about a week. In contrast to the other experiences we'd had, no matter what weirdness was in the CVS repo, Bazaar's parser and import logic just worked around it, with reasonable results.

The bad news is that the import took a long time.

And I mean a long time.


Processed 174786 patches (174786 new, 0 existing) on 743 branches (1 tags) in 2890557.7s (0.06 patch/s)

Yes, you're reading that right: on a dual-core, dual-CPU 2.8 GHz Pentium 4 with 4 gigabytes of memory, it took over a month of constant runtime to complete a trunk-only import of the CVS repository.
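
For anyone who wants to check the arithmetic behind "over a month", here is a minimal Python sketch. The two input figures are copied from the import log line above; the variable names are illustrative only and are not part of any bzr tooling.

```python
# Figures copied from the bzr import log line quoted above.
patches = 174786              # patches processed
elapsed_seconds = 2890557.7   # total runtime reported by the import

SECONDS_PER_DAY = 24 * 60 * 60

days = elapsed_seconds / SECONDS_PER_DAY        # runtime expressed in days
rate = patches / elapsed_seconds                # patches per second
seconds_per_patch = elapsed_seconds / patches   # average cost of one patch

print(f"{days:.1f} days of runtime")
print(f"{rate:.2f} patch/s")
print(f"{seconds_per_patch:.1f} s per patch on average")
```

The runtime works out to a little over 33 days, which agrees with the 0.06 patch/s the importer reported.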

A month to import might be ok, if day-to-day operations were on par with Mercurial (or even CVS), but jst did some tests, and found out that, unfortunately, they weren't.

We did most of these tests with the bzr version available at the time, 0.14, and jst re-did them multiple times in response to feedback.

With some more help, we got the times on jst's test runs down to multiples of Mercurial's performance (as opposed to orders of magnitude). In the end, the third system succumbed to concerns about the first two: performance.

Astute readers of m.d.planning (or obsessive observers of the mozilla.org DNS zone file) may have noticed a new entry recently: hg.mozilla.org.

After a lively discussion, we decided to import the trunk CVS code using an epoch date (which turns out to be "22 Mar 2007 10:30 PDT", aka the MOZILLA_1_9_a3_RELEASE tag's pull date) as opposed to importing the whole history. cvs.mozilla.org will never go away, even if it's retired to the old-VCS's home and put into read-only mode, and the import, even of only the trunk, created repos with large initial pull times and space requirements.

The contents of the initial Mercurial import can be pulled from CVS by pulling the HG_REPO_INITIAL_IMPORT tag from mozilla/ in CVS (only the imported files were tagged, so client.mk-fu isn't required).

As those who spent a bunch of their time and energy involved in this decision can tell you, it wasn't an easy one. There were a lot of passionate discussions about features vs. performance tradeoffs, the priority of our requirements, and a desire to put the rubber to the pavement on Mozilla 2 work.

It's important to realize the headline here isn't "Mozilla Project picks Mercurial for Next Generation Version Control System." It's "Mozilla Project moves to Next Generation Version Control System."

There was a lot of support in the project for both tools, and I personally know that the Bazaar developers spent a bunch of their 0.15 development time working on some of the performance issues we ran into. The great thing about these "nextgen" version control systems is that they all track the information necessary to re-create history. Because of this, switching between systems is much easier, and in some cases, using your favorite system is possible (bzr, for instance, can pull directly from Mercurial).


Player II is up!

There's been serious talk of re-evaluating our needs and the state of these two systems in 9-12 months, after we have some concrete development, build, and tooling experience with Mozilla 2 and Mercurial. It's not a given that we'd make the same decision at that time.

One of the take-aways from this process is that all of these systems are relatively new (especially considering CVS started development in the mid 80s). None of them are perfect. They all have strengths and weaknesses and it's been interesting to see how they all solve problems, and to watch all of these systems progress in even the short amount of time we've been evaluating them.

Distributed version control is a new and strange world... but we're excited to be here.

Finally.

I'd like to give a special shout out to John Meinel of the Bazaar team for his help in this process; he spent a bunch of time working with us on import strategies and multiple custom imports, and he loaned us a completed Bazaar import for testing purposes. We really appreciate your and your community's help.


Comments

I was able to import the entire Mozilla CVS history into git in about four hours on a single core 2.8GHz machine with 2.5GB RAM. Resulting repository was 450MB, compared to the 3GB of CVS files. Have you considered applying Mozilla resources to the Windows port of git? Wouldn't that be easier?

Love the pictures ;)

I am curious, what were the problems you all had with CVS? And did you guys investigate Darcs at all? The Haskell community is pretty smitten with it (though I can't vouch for it personally).

Is there a public repository from which we can hg clone? Where?

Gosh, those screenshots sure do bring back memories. Long days molesting a keyboard with two people. Molesting each other black and blue in the process as well.

Also kudos to the moz devs for taking this big step. Will be very interesting to see how mercurial holds up.

I'm curious as to why Darcs http://www.darcs.net wasn't considered. I'm not sure how the performance would have stacked up on such a large codebase but theoretically they've got it working at reasonable speeds with the linux kernel now so...

Personally I love the way Darcs goes about things and the work they've put into the human interface.

Great post!

It's surprising how many free VCS developers don't "get" the importance of win32.

Don't rule out Git because it doesn't have the Windows support that you want. Git is several orders of magnitude faster than the closest contender on GNU/Linux. How about Git with a proxy to enable other systems to connect to it? Wasn't there Git backend support for Bzr?

How recently did you look at the win32 port of Git? git-mingw32 develops rapidly, and all reports I've seen suggest that many of the massive speed issues have gone away. You might find it a constant factor slower than the GNU/Linux or OS X port, but you shouldn't find it big-O slower anymore.

A while back I played with importing the Mozilla codebase into darcs (nothing to do with the Mozilla evaluation, just personal experimentation as I'm a darcs fan too). Without even importing the history, darcs couldn't handle the first patch importing all the files due to running out of memory.

Adding the +RTS commands to increase the heap available to it didn't help much. I got the initial patch to record but additional 'whatsnew' and other operations fell over with not enough heap, even with 2GB assigned to it.

Please reconsider git-mingw32

masukmoi: a generation of free VCS' "got" the importance of win32. They sucked the same way on all platforms and very few people considered them.

Then Linus decided the situation was hopeless and created his own VCS in a few weeks. Of course it was Linux-oriented. hg emerged at about the same time and competed against git - again Linux was the main platform because win32 people were content with CVS or expensive proprietary VCSs.

Now win32 people complain they were left out, but nobody forbade them to join the fray at the right time. They didn't, and now there's some catching up to do.

What about svn?

Would be interesting to know how long import took with hg/Mercurial.

Colin: Um, git was written by Linus personally to handle Linux kernel development. What is this "importance" you talk about? Win32 people should be grateful that he merely accorded Win32 support zero weight and didn't do anything to actively sabotage the port.

The big issue is that it was developed as some core "plumbing" written in C that does all the performance-critical stuff with a layer of "porcelain" shell scripts on top to make it more comfortable to use.

And the Bourne shell just doesn't work well on Microsoft OSes.

But there's no malice. Git lost its dependence on symbolic links very early, and there is a project to rewrite all the shell scripts in C. But frankly, it will take some Win32 programmers to drive that to completion; for the Linux/Unix guys, it works great already.

Really, hg and git are very very similar in their basic models. They were written at the same time and share a lot of ideas. Converting between them is easy if you ever want to do it later. Indeed, it's probably possible to connect them "live" so people can use both and interoperate seamlessly.

But after getting used to git's ludicrous speed, I'm pretty much spoiled for anything else. I get grumpy when diffing 100 MB of source takes more than a second.

And merging... it took me quite a while to get used to the fact that git actually *does* something in the brief instant before it gives me back a prompt.

And now there's a Summer of Code project to implement "gittorrent", so making code available by git will become relatively painless for the server.

ZeD: svn is a centralized VCS, a "CVS done right". There's still one center that you need to talk to for every operation. Git and hg are fundamentally distributed; there's no preferred "center". You download a copy of the entire repository, history and all, and extend it. You can share those extensions with other people, either by "pushing" them (if you have access), or by announcing a server and letting others "pull" from you. You're welcome to implement an upstream/downstream hierarchy, but the tools don't enforce one.

Looks like Diane Trout was able to get the whole mozilla cvs root converted to Mercurial:

http://alienghic.livejournal.com/tag/mercurial

For those people who talk about git-mingw32, I think you miss the point of what makes a VCS nice on Windows. Take a look at TortoiseSVN to get an idea. Just a command line won't cut it.

Did you see that, according to the Mercurial developers, the Mozilla project didn't do much to contact them?

http://www.selenic.com/pipermail/mercurial-devel/2007-April/001462.html

To you folks calling for the mozilla guys to reconsider 'git-ming32': googling for that keyword at this time results in exactly 1 hit, this page. If it really does exist, then please supply a link or a better phrase to google for. Even the Git homepage does not contain a link to such a thing.

I'm glad Git is fast and all but it will continue to get passed over for big projects like this one as long as it doesn't take Win32 seriously.

Have you considered applying Mozilla resources to the Windows port of git? Wouldn't that be easier?

How could that be easier than using Mercurial, which seems to already work acceptably as is?

There's a slight typo in your google search. You're missing a 'w' before the 32. Nonetheless, I'd recommend using the phrase 'git mingw' as a better alternative if you're interested in finding more on the project.

I'm not 100% sure if this is the "official" project page, but it seems like it: http://code.google.com/p/msysgit/

FWIW, there's a TortoiseHg under development:
http://tortoisehg.sourceforge.net/
