The results are somewhat interesting, especially when you take into account that Python 3 is the future and a lot of standard library consolidation occurred for it. Take, for instance, the current winner, which is "consolidate(urllib, urllib2, httplib)". This is actually unwarranted, as Python 3.0 fixed this with the http package. So the most desired change has already happened. Go us!
The second one is datetime. Now I, for one, do not see what is wrong with the module. It does not follow PEP 8 naming standards, which is a frequent complaint I hear. But otherwise I personally do not see the problem. I asked on Twitter what issues people had, and none of them really turned out to hold up when you think about it. For instance, someone said they wanted to be able to subtract a time object from a datetime object. But that makes no sense, as a time object simply reflects a point in time for any day. What this is asking for is either to subtract an amount of time represented by the time object or to assume the time object is for today. The former makes no sense because a time object is a point in time on an arbitrary day, not a duration, so you can't find the time delta when you don't know what day you are working with. And if you want a time object to implicitly represent today for mathematical reasons, I would point out how the word "implicit" came into this. Python doesn't like implicitness. If you want the time to be for today, then you should create a datetime object for today with the time set to your time object (which is not hard to do).
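As a minimal sketch of that last suggestion: datetime.combine() attaches a time of day to an explicit date, after which ordinary datetime subtraction works. The helper name on_day and the sample values are made up for illustration.

```python
from datetime import date, datetime, time

def on_day(clock_time, day=None):
    """Attach a time of day to an explicit date (today if none is given)."""
    if day is None:
        day = date.today()
    return datetime.combine(day, clock_time)

# Hypothetical values: a time of day and a full point in time.
meeting_start = time(9, 30)
moment = datetime(2009, 8, 14, 17, 0)

# The day is stated explicitly, so the subtraction is well defined.
elapsed = moment - on_day(meeting_start, moment.date())
print(elapsed)  # 7:30:00
```

Calling on_day(meeting_start) with no date gives the "assume today" behaviour, but only because the caller asked for it.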
Basically every person who has ever told me that datetime needs work can never give me a concrete way to improve it or wants it to make assumptions that do not universally hold for everyone or make real sense when you think it through. So I consider this a side-effect of working with date and time being extremely difficult and a complete pain and people just wanting less pain somehow.
Coming in third place is logging. Now here is a package that I have never heard anyone say they thought was perfectly designed. While everyone is glad to have the package, it is rather obvious there is a Java heritage to the design. The package was added in Python 2.3, which puts it at the cusp of major changes in Python such as decorators and before other things such as context managers ('with' statements), the new I/O library, etc. There is even a chance that this poll is going to inspire someone to actually come up with an alternative. But as with all new modules wanting to come into the standard library, it will need to have existed for at least a year and have become the best-of-breed solution as considered by the community before we can consider including some logging reboot.
27 comments:
I believe that people complaining about datetime need the features of dateutil (http://labix.org/python-dateutil) inside datetime module.
There is an old discussion in python-dev about this: http://mail.python.org/pipermail/python-dev/2004-March/043054.html
I don't really agree that the consolidation of urllib(2) and httplib is done in 3.x; there's still a lot missing (proxy handling is broken, for instance).
I quite like the datetime module. So much so that I re-implemented it in C++. It doesn't have an API that makes every single kind of datetime manipulation you can think of possible, but that's why I like it. Dates and times are very complicated, and APIs that try to make everything possible end up being mega complicated. I never want to have to think about using the right Calendar object when I just want to get the local time, for instance. It's easy to use and covers off the common use cases for such a thing quite well. One thing I would like is for it to be UTC-based by default, with local time being a second-class citizen. Another thing I'd like is for it to provide a simple way to get a timestamp from a datetime instance. Currently you have to do some math to achieve this:
    from datetime import datetime
    def to_timestamp(date):  # date is a naive datetime holding UTC
        delta = date - datetime.utcfromtimestamp(0)
        return (delta.days * 60 * 60 * 24) + delta.seconds
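For comparison, here is a sketch of the same conversion without the hand-written arithmetic, using calendar.timegm() from the standard library. It assumes the datetime is naive and already holds UTC; the function name utc_timestamp is made up for illustration.

```python
import calendar
from datetime import datetime

def utc_timestamp(dt):
    """Whole seconds since the Unix epoch for a naive datetime holding UTC."""
    # utctimetuple() yields a UTC struct_time; timegm() is the inverse
    # of time.gmtime(), so no local-time conversion is involved.
    return calendar.timegm(dt.utctimetuple())

print(utc_timestamp(datetime(1970, 1, 2)))  # 86400
```

Later Python versions (3.3 and up) also provide datetime.timestamp(), though it treats a naive datetime as local time rather than UTC.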
I would like to see a standardized testing stub/mock library. However, maybe there is not enough consensus to accomplish this.
@casey - Write a PEP proposing it, put it on python-ideas. A "consensus" isn't needed, but you've got to make a good, solid argument for it.
The problems I have with Python's datetime module are the same problems I had with java.util.Date in Java. In no particular order:
* the library is difficult to use correctly (needs better facades)
* a rich API for traversing times is required (datetime.timedelta is OK, but could be much better)
* there is a ton of confusion about UTC time, local timezone time, and "timezoneless" times, none of which are addressed by the library itself.
* datetime does not provide a way to represent the abstract notion of "2pm on July 14" (timezoneless date/time without year), "start of first Monday in July EST" (timezoned date/time without year or precise day), etc. Instead, it only provides an API for representing "instances in time on particular dates in particular timezones".
* no direct support and facade for the complex but widespread ISO8601 standard
* the library provides no mechanism to make it easy to do testing, e.g. custom "chronologies"
These are the issues (and more) that were addressed by the Java "Joda Time" project and then, later, by JSR-310.
http://joda-time.sourceforge.net/
https://jsr-310.dev.java.net/
I've always wanted to "port" these libraries over to Python (a Pythonic port, of course), but just haven't gotten the time...
What bothers me about datetime is that it's not symmetric (by a long shot). You get fromtimestamp, but not totimestamp, strftime but not strptime -- ever tried parsing a string into a datetime? You have to do all sorts of dancing between datetime.datetime and the time module, *and* they're not easily convertible into each other, either: datetime(*time.strptime(...)[0:8]) (or whatever the correct spelling was) just makes me sick.
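For reference, a short sketch of both spellings of the parse. datetime.strptime() has been a classmethod since Python 2.5; on older versions the detour through the time module needs the first six fields of the struct_time. The sample string and format are made up for illustration.

```python
import time
from datetime import datetime

text = "2009-08-14 09:30:00"   # hypothetical input
fmt = "%Y-%m-%d %H:%M:%S"

# Python 2.5 and later: parse straight into a datetime.
direct = datetime.strptime(text, fmt)

# Older idiom: year, month, day, hour, minute, second are fields 0-5.
via_time_module = datetime(*time.strptime(text, fmt)[:6])

print(direct == via_time_module)  # True
```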
datetime's timezone support is just ferociously difficult to use. Every single time I need to do timezone stuff with it I spend a bunch of time testing and swearing and trying to figure out if I'm doing the right thing. I do not have this problem with many of the other standard library modules.
and datetime.strftime works only for years over 1900. A shame to get familiar with this in production
To everyone else's comments about datetime (especially Andrew's), I'll add that datetime only supports CE dates (year must be >= 1) and I'm not too sure about its handling of the Julian-Gregorian switch.
Have you seen this blog entry on problems with the Python datetime module:
http://www.enricozini.org/2009/debian/using-python-datetime/
A few fun facts from the article:
* Avoid using str(datetime_object) or isoformat to serialize a datetime: there is no function in the library that can parse all its possible outputs
* datetime.strptime silently throws away all timezone information. If you look very closely, it even says so in its documentation
* Timezones do not exist, all datetime objects have to be naive. aware means broken.
* datetime objects must always contain UTC information
* datetime.now() is never to be used. Always use datetime.utcnow()
* Be careful of 3rd party python modules: people have a dangerous tendency to use datetime.now()
* If a conversion to some local time is needed, it shall be done via either some ugly thing like time.localtime(int(dt.strftime("%s"))) or via the pytz module
* pytz must be used directly, and never via timezone aware datetime objects, because datetime objects fail in querying pytz
There are more...
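One way to sidestep the serialization point, sketched under the assumption that all values are naive UTC datetimes as the article recommends: pick a single explicit format string and use it for both writing and reading. The names FORMAT, dump and load are made up for illustration.

```python
from datetime import datetime

# Hypothetical wire format; %f (microseconds) needs Python 2.6 or later.
FORMAT = "%Y-%m-%dT%H:%M:%S.%f"

def dump(dt):
    """Serialize a naive datetime with one fixed, fully specified format."""
    return dt.strftime(FORMAT)

def load(text):
    """Parse text produced by dump() back into an equal datetime."""
    return datetime.strptime(text, FORMAT)

original = datetime(2009, 8, 14, 9, 30, 15, 250000)
print(load(dump(original)) == original)  # True
```

Unlike str() or isoformat(), the output shape does not change when the microsecond field happens to be zero, so the parser never sees a variant it cannot handle.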
Here's another one. It focusses on the 'Helsinki problem' but mentions a few other issues as well:
http://blog.twinapex.fi/2008/06/30/relativity-of-time-shortcomings-in-python-datetime-and-workaround/
From my point of view the lack of a robust and flexible date / time string -> datetime conversion is the biggest problem.
Have you ever used pytz? Its documentation explains some of the shortcomings of the datetime API that make timezone-aware datetime manipulation difficult and error-prone.
To summarize on datetime: it's not so much what it is, but what it isn't. It's like someone started writing the module, then just stopped half way through. Tiny additions have occurred since then, but just a method here or there when it really needs substantial new functionality. Nearly everything I miss in datetime is in dateutil; putting dateutil into the standard library would pretty much solve the problem.
Ian hit it on the head; it isn't so much that the module needs a redesign (which is what the poll was meant to be about), but that it is missing some functionality. Adding to a module is much easier than getting a new module into the standard library. If people would like to see some functionality added to datetime, please consider providing patches. It sounds like there is enough interest that, if you tried to organize on comp.lang.python, enough people might get together to flesh out the module.
A simple extension to Python's logging module would greatly increase its power and flexibility. See
http://edreamleo.blogspot.com/2009/08/better-python-logging.html
As the maintainer of the logging package, I find it interesting that logging has come third in your poll as needing a reboot. I like to think that I am fairly receptive to suggestions regarding changes to the package, and issues raised on bugs.python.org generally get a fairly quick response from me. So I am a little surprised to see that Andrii Mishkovkyi has apparently been tasked with rewriting the package, without even involving me in any discussion! Can I assume that discussion on this topic will be on the Python Wiki at http://wiki.python.org/moin/LoggingPackage, or will I have to look elsewhere to get the complete picture? I will certainly update that page with my initial responses to the topics mentioned there.
@Vinay Andrii hasn't been tasked with anything. He just noticed the result of the poll and said he might look at trying to do a reboot for the module. I don't know what Andrii's plans are as it came up on Twitter and has not moved beyond that.
And no one has suggested you are not receptive/responsive, Vinay, but you do have to stick with the general feel of the module and for some people that feel is not what they want.
Sorry, I mistyped Andrii Mishkovskyi's name.
And since I'm posting this comment, anyway, let me make another few points.
What exactly is the problem with the Java heritage - why does it keep getting mentioned in an arm-waving sort of way, without specific criticisms? I disagree that there is a "Java heritage" to the design, other than the idea of Logger namespace hierarchy, level-based filtering, and Handlers. These abstractions are central to any decent logging package, since when logging you have to consider "What happened?" (details of the logging call), "Where did it happen?" (logger namespace), "How important is it?" (logging level) and "Who wants to know?" (handler configuration). Note that with Filters, you are not restricted to level-based logging, but could easily use ideas such as mentioned by Edward Ream in his post referred to above.
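As a rough sketch of how those four questions map onto the package: the logger name answers "where", the level answers "how important", the handler answers "who wants to know", and a Filter adds a criterion that is not level-based. The ContainsKeyword and ListHandler classes and the logger name are made up for illustration.

```python
import logging

class ContainsKeyword(logging.Filter):
    """Hypothetical filter: pass only records whose message has a keyword."""
    def __init__(self, keyword):
        logging.Filter.__init__(self)
        self.keyword = keyword

    def filter(self, record):
        return self.keyword in record.getMessage()

class ListHandler(logging.Handler):
    """Hypothetical handler: collect formatted records in a list."""
    def __init__(self):
        logging.Handler.__init__(self)
        self.messages = []

    def emit(self, record):
        self.messages.append(self.format(record))

logger = logging.getLogger("shop.checkout")   # where did it happen?
logger.setLevel(logging.INFO)                 # how important is it?
logger.propagate = False                      # keep the sketch self-contained

handler = ListHandler()                       # who wants to know?
handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))
handler.addFilter(ContainsKeyword("payment"))
logger.addHandler(handler)

logger.debug("payment token parsed")   # dropped: below the logger's level
logger.info("cart updated")            # dropped: rejected by the filter
logger.warning("payment declined")     # recorded: what happened?

print(handler.messages)  # ['shop.checkout WARNING payment declined']
```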
Does the whole of the rest of the stdlib use decorators, new I/O, context managers etc.? Even if logging is somehow being singled out to make use of these things as much as possible, this might affect the *implementation* of logging, but does not necessarily invalidate the *design* of logging. Or if you think it does, how about a more specific discussion on the Wiki page?
@Vinay Ignoring the naming scheme, from what I can tell the comment comes from the fact that most people are familiar with log4j and such which has a similar structure. Maybe people want something less verbose? I honestly don't know since when I use logging I keep it to logging.error, etc. I'm just a messenger.
As for the standard library and picking up new features, some things get huge uptake, others do not. Decorators have been used in some places, not in others. Context managers, on the other hand, have been added to threading, io, and unittest, to name what I can think of off the top of my head. You have been contributing to the core long enough to know that it takes either you or someone else to come forward to add a feature, so if a module has not picked up something it's because it is inappropriate or no one has submitted a patch.
As for any discussion, as I said, I am just a messenger. I am not enough of a heavy logging user to even criticize. I was just writing about the result of the poll and what I have heard from other people from time to time.
@Brett, I fully understand that you're just the messenger - and re. my comment about Andrii taking on a rewrite, I'm not touchy about it, hence the exclamation mark! And should you feel you want to make specific criticisms about logging, you are of course always free to do so, however much or little qualified you feel you are. Being a messenger is an important role, and I also wonder to some extent whether your categorisation of yourself as "not enough of a heavy logging user" is itself a criticism ;-)
One can't please all of the people all of the time, so there will always be people who find logging doesn't fit well with the way they think. I have no problem with that. (It's not too different from the phenomenon of apparently intelligent people who "love" Ruby but don't "get" Python, or vice versa.) But the people who are happy with logging would scarcely have bothered to stand up and say so, which gives, in a poll like this, a louder voice to the naysayers. That's fine too, as I would expect logging and the rest of the stdlib to improve by Python devs listening to the naysayers rather than complacently basking in a let's-pat-ourselves-on-the-back atmosphere.
You say that "you do have to stick with the general feel of the module and for some people that feel is not what they want." and to that, I have already responded in the second paragraph of this comment. However, if I am supposed to make specific changes to the logging package to accommodate some of the criticisms (and I am very open to this), one can hardly expect me to do this based on "the general feel of the module". Specific criticisms, patches, and RFEs can always be logged on bugs.python.org, and more general discussion can happen on python-list, python-dev and the Wiki. I've already responded on the Wiki page, and look forward to more specific, substantiated criticisms which give me something concrete to work on.
I've commented at length on your post only because it was publicised enough for me to come across it first, and I assume many others will come to it too. My comments should serve to direct everyone with sufficient interest to the Wiki page mentioned above, or to use python-list/python-dev/Roundup/Wiki to continue further discussion. I hope I have been specific here, and have responded only to comments which you or others have made here.
@Vinay I fully admit I don't do enough logging in my code. So yes, it is a partial criticism at myself. I am getting better, though!
As for the naysayers having a louder voice, there is a purpose to having the X vote to vote down a result. And that was used in instances where people did disagree that a module should be considered for a redesign.
And I think you have responded well, Vinay. Hopefully something positive can come from all of this.
@Vinay
I'm sorry if my post on twitter seemed offensive to you -- I didn't mean that, I just wanted to raise a discussion between users of logging package. This discussion evolved into this page on the wiki http://wiki.python.org/moin/LoggingPackage on which you provided great answers.
Also, I must correct Brett -- I haven't yet volunteered to rewrite the logging package; I'm considering just sending patches for the things I don't like. It was Jesse who told me that I should rewrite logging from scratch, but I'm not going to do that until I'm absolutely required to. :)
Anyway, to the logging topic -- the main concern among users of the logging package is that it just "doesn't feel right". Maybe it's because the whole concept of logging doesn't feel right for Python people. :) Or probably it's because logging is quite heavy-weight and Pythonistas are used to having "basic building blocks" in the stdlib. I'm among those people for whom logging just doesn't feel right, but I can live with it until I can see a way to change that.
btw, thanks for the replies on the wiki.
I would have put zipfile into this; ideally I'd prefer something like a VFS for this.
Trying to subtract a time object from a datetime object makes it sound like people are using time objects as if they were deltas or intervals, instead of representations of a time of day. That in itself indicates a probable mismatch: The one obvious way to do it is wrong, which means the right way should be more obvious.
I consider this more of a bug of datetime: datetime.time() is evaluated as False, which has no semantic reason (midnight is false?)
>>> import datetime
>>> if datetime.time(9,30):
...     print '09:30'
...
09:30
>>> if datetime.time():
...     print '00:00'
...
>>>
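A defensive sketch that avoids depending on the truth value at all: compare against None explicitly when a time is optional. The function name has_start_time is made up for illustration. (Python 3.5 and later treat every time object as true, so the surprise shown above applies to earlier versions.)

```python
from datetime import time

def has_start_time(start):
    """True when a start time was supplied, midnight included."""
    # Compare with None instead of relying on the object's truth value.
    return start is not None

print(has_start_time(time()))   # True
print(has_start_time(None))     # False
```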
there is a need for a set of utilities around datetime, for example, dealing with cyclic time is a pain (11pm + 3 hours = 2am) - of course, there is a problem with day light saving time (sometimes 11pm + 3 hours isn't 2am), but that happens only when the date is relevant to the problem.
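A minimal sketch of such a utility for the case where the date is irrelevant: anchor the time to an arbitrary day, add the delta, and keep only the time of day. The names ARBITRARY_DAY and add_to_time are made up for illustration, and daylight saving time is deliberately ignored.

```python
from datetime import date, datetime, time, timedelta

# Any fixed date works; it only exists so the addition is defined.
ARBITRARY_DAY = date(2000, 1, 1)

def add_to_time(clock_time, delta):
    """Add a timedelta to a time of day, wrapping around midnight."""
    return (datetime.combine(ARBITRARY_DAY, clock_time) + delta).time()

print(add_to_time(time(23, 0), timedelta(hours=3)))  # 02:00:00
```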
moving on, the unittest module works nicely for small projects, but kind of breaks down when you use it at a large scale (for example project/module/class fixtures, the ability to run tests from multiple places, time-dependent tests)