2009-03-31

Why Python is switching to Mercurial

Starting at PyCon 2008, thanks to Barry Warsaw and the Bazaar team, I started thinking about moving Python over to a distributed version control system (DVCS). While I wanted offline commits for the benefit of non-core developers, along with easier merging from 2.6 to 3.0 (ah, the days when there were only three branches under development), I knew that would not necessarily be enough of a reason for others to switch.

Come October I started a PEP for switching from svn to a DVCS. Originally it was going to be hg vs. bzr, but enough of an outcry on python-dev led me to relent and add git to the PEP. With the list of DVCSs decided, I began writing up common use cases that I and other developers have come up against in developing Python. I then had a representative for each DVCS fill in the PEP with the best solution to each use case (I am grateful to them for helping). This all became PEP 374.

And for me that's when the stress began. I was being bombarded from all sides on this pretty regularly. I quickly realized that choosing a DVCS is like choosing a code editor; it's a very personal thing for a lot of people. Plus I forget how big Python is now; when what I was doing hit the Net I ended up talking with developers from all three DVCSs, which I didn't expect.

Time passed and I tried to absorb all three DVCSs as much as I could, although with my internship and trying to finish importlib for 3.1 I only had so much time. I ran a survey of the core developers in which I asked them to rate each of the three DVCSs as better than, equal to, or worse than svn, if they felt they had enough experience to have an opinion.

The survey showed that git was clearly the tool the core developers disliked most. Given that result, along with git having the weakest Windows support and not being implemented in Python, I decided to eliminate git from the running and announce its elimination at the first lightning talk at PyCon.

When I arrived at PyCon pretty much everyone asked me about the DVCS PEP. People wanted to know how it was going and who was going to win, and they gave me support/pity for what I was going through. Guido noticed this and decided to end my misery by saying he wanted to make a decision by the end of PyCon. I said I was fine with that, as one DVCS was already about to be eliminated and I knew my personal preference at that exact moment aligned with Guido's.

So I did my lightning talk eliminating git. Luckily that went well with only about two people telling me directly they disliked the decision.

But the more telling thing was what everyone else told me after that lightning talk. I ended up with a surprisingly large number of people -- including core developers -- telling me they wanted, preferred, or guessed that hg would now win. Now the guesses could be explained away by Guido having publicly stated he likes hg, but to me the number of people telling me they wanted hg to be chosen was surprisingly large. And honestly no one told me they preferred bzr (although no one said I had better choose hg over bzr either).

So Monday morning came around and I walked into the sprint. I asked Guido if he was ready to make a decision. He said yes, we both said hg, and so Guido tweeted the decision before telling python-dev that we chose Mercurial.

There has been a lot of speculation as to why Guido pronounced the way he did. On Twitter Guido said to read PEP 374 for the reasons. Since I helped write the PEP, my reasons are reflected in it as well.

Obviously community preference as shown at PyCon played a role. No one wants to choose a DVCS that makes the community not want to contribute to Python. And I would never choose a VCS that would make Guido not want to work on Python. Some people seem surprised that something non-technical played a role, but to ignore social issues is to ignore how much open source is a social phenomenon. And we are not the first project to take social preference into consideration: I know both GNOME and Pinax chose git because their developers preferred git.

And there are technical reasons. Hg being 2x to 3x faster than bzr does matter to some extent. No one wants to lose a contributor because they didn't want to wait for a checkout. And having personally experienced long checkout times because of a subpar connection to a specific server, I know this can occur. The performance margin between hg and bzr is typically within reason and is not a flat-out deal-breaker, but it doesn't help bzr either.

Bazaar also has its short history of format stability working against it. The tool has changed its format at least three times based on what the man page says (1.0, 1.6, and 1.9). Mercurial's format, on the other hand, has been stable since, I think, it went public or near that time. They take great pride in the fact they have not changed it. And that stability aligns more closely with python-dev's sensibilities.

Stephen Turnbull's explanation on the bzr mailing list is also a good account of why we chose hg. Basically no one is saying bzr is bad, just that hg is a better fit for our needs on python-dev.

But the thing I really love about having made this decision -- other than that I don't have to stress about this anymore -- is that everyone seems to be flat-out happy we decided to switch. Once again the Python community stands out as being friendly and understanding about stuff like this, with no one really seeming to be upset that we made the decision we did.

As for when the switch will happen, I don't know. We are hoping by summer, but that is just a hope at the moment. We have to figure out the best way to convert our history as well as what workflow we want to have.