2009-05-30

The staleness of the standard library and adding new things

If you happen to follow certain members of python-dev on Twitter you may have noticed a discussion going on over Zed Shaw's "Curing Python's Neglect" post about inconsistencies in the language and the standard library, focusing mostly on the latter. This caused some people to say Zed should be lindberg'd (link is currently down) for complaining about the state of Python's standard library without offering solutions. Zed then responded that there is a double standard between Python committers and everyone else when it comes to getting changes accepted, and that this has led to crap making it into the standard library.

There seems to be a misunderstanding here, involving Python's history and how things operate over on python-dev, that I would like to clear up. I don't think it is isolated to Zed, so this post is not meant to pick on him specifically; he just happened to spark the discussion.

I want to start off by acknowledging that the standard library is far from perfect. There are modules in there that are of subpar quality when compared to other parts of the standard library. And even the modules that are considered good can have inconsistencies in their APIs that make them infuriating to use on occasion. There is work that could stand to happen to clean up some things.

Today, to get something added to the standard library, we ask that the code be considered best-of-breed by the community, that the developer promise to maintain the code, and that the maintenance happen within Python's development process and not outside of it. There can be a PEP involved depending on how large the code is and how much of a fight there is to get the code included. And of course this is all dependent on whether the core developers believe the module has enough widespread usefulness to warrant adding it to the standard library.

But historically these requirements are a recent turn of events. I would say that they did not start to be seriously enforced until some time during the Python 2.5 development cycle, which started in November 2004 and ended in September 2006. A lot of the standard library predates 2.5 by several releases. If you look at the modules that Zed said could use some improvements, you will notice that none of them were added to the standard library recently (and some, such as setuptools and easy_install, are not in the standard library at all, although Zed mistakenly thought they were):
  • os: 1992 (I don't even know what release that is)
  • time: 1990 (less than a year after Python's creation!)
  • datetime: 2.3
  • email: 2.2
My point is that if a module in the standard library seems a little stale, and its API looks like it could have used more public vetting, then it probably could have, but it was accepted anyway. People need to realize that Python was started by Guido to scratch an itch and just happened to turn into this massively popular programming language. Python's popularity took off starting shortly after the first PyCon in 2003, IMO. But because the growth was organic, python-dev did not realize how much more careful we had to be with standard library inclusions until a few years later. Before we instituted more stringent acceptance requirements, things were added simply if people offered up the code, were willing to maintain it, and python-dev thought it would be useful.

But admittedly, even today some modules get to short-circuit the acceptance process. Importlib is a perfect example of this as I didn't have to put it out there for a year to make sure people thought it was best-of-breed. I didn't have to write up a PEP to vet the API. The only thing I had going for me was a known need for what importlib provides. Otherwise the code went in because I wanted it to go in and I happen to be in a position where I can make that simply happen.

But I skipped the steps only because of other things I did. I skipped the year-long acceptance by the community because I was targeting Python 3.0, which had not yet been released as final. I also blogged extensively about importlib and its API so it was at least discussed publicly somewhere. There were several core developers I talked with about the API who also watched every commit I made with interest and provided feedback. I talked with people at PyCon about this stuff. There is a reason I spent years working on importlib.

But importlib is an exception, not the rule. If my bootstrapping goal did not exist importlib would have existed externally for at least a year before I tried to bring it into the standard library. And if I had gotten any realistic pushback from python-dev I would have waited for inclusion, but thankfully I didn't receive any. And trust me when I say that people on python-dev are quite happy to share their opinion if they think something is a bad idea. While people who have proven themselves do get to skip some steps, don't think that they get to do whatever they want to the standard library.

With all of this in mind, how should modules get fixed? If an API is truly deemed poor and in need of a replacement, the module can either grow a new API next to the old one or be made obsolete by a new module. Doing the former is nice for stupid little API mistakes, but it does require working properly with the pre-existing module, which might be a hindrance.
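To make the first option concrete, here is a minimal sketch of a new API growing next to an old one inside the same module. The names split_pairs() and splitpairs() are made up for illustration and do not correspond to anything in the standard library; the only real piece is the warnings module, which is one common way to flag an old function as deprecated.

```python
import warnings


def split_pairs(text, sep=","):
    """Newer API (hypothetical): return a list of (key, value) tuples."""
    pairs = []
    for item in text.split(sep):
        key, _, value = item.partition("=")
        pairs.append((key.strip(), value.strip()))
    return pairs


def splitpairs(text):
    """Older API (hypothetical), kept so existing callers keep working."""
    warnings.warn(
        "splitpairs() is deprecated; use split_pairs() instead",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller's line
    )
    # The old function returned a dict, so that behaviour is preserved.
    return dict(split_pairs(text))
```

Note that the old function still has to behave exactly as it always did, which is the "working properly with the pre-existing module" cost: the new API cannot simply pretend the old one never existed.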

Adding a new module breaks cleanly from the old module, allowing a new API to exist from scratch. But this has the issue of being a much larger burden for support as there will simply be more code. There are possible stability issues, etc. Even code that has existed out in the public for a year is not guaranteed to be of top-notch quality.

And of course this all assumes that an API is actually fundamentally broken. Minor issues can be overlooked in the name of prior knowledge. When we add something to the standard library we are essentially asking every Python developer out there to learn about this new code and consider using it, which is quite the mental burden if you have been using the older API for years. And it is also possible that you are in a minority in thinking it is broken.

This is all rather complicated and nuanced. Coming up with a solution that pleases everyone is impossible. But python-dev continues to try to do its best and hopefully that does please most people.