2008-10-27

Frozen packages and their busted __path__ attribute

The other night, as part of my quest to rewrite importlib's unit tests to run in proper isolation, I discovered that frozen modules are busted. Or, more precisely, frozen packages are busted.

Turns out that the __path__ attribute on frozen packages is simply set to the package's name. That is wrong, as the attribute should at least be a list, even if it is just an empty one. But what is even more annoying from my perspective is that the import machinery actually relies on the attribute being a string. If you look at find_module in Python/import.c you will notice that frozen modules are only checked for if __path__ is a string or is not set at all. And the latter case uses only the tail of the module name, not the whole name, so just deleting __path__ doesn't work at the moment.
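To make that control flow concrete, here is a rough Python sketch of the behaviour as I have described it. It is not a translation of the C code: the function name, the FROZEN set, and the module names in it are all made up for illustration, and the string-joining step is my reading of how the package case builds the name to look up.

```python
# Hypothetical stand-in for the table of frozen modules compiled into the
# interpreter; the names are invented for this example.
FROZEN = {"pkg", "pkg.mod"}


def find_frozen_sketch(fullname, path=None, frozen=FROZEN):
    """Return the frozen name that would be found, or None.

    Models the described behaviour only: frozen modules are consulted when
    `path` is a string or is None, and never when it is a list.
    """
    tail = fullname.rpartition(".")[2]
    if isinstance(path, str):
        # Package case: the name is rebuilt by joining the string __path__
        # (the package's name) with the tail of the module name.
        candidate = path + "." + tail
        return candidate if candidate in frozen else None
    if path is None:
        # No __path__ at all: only the tail is looked up, so a frozen
        # submodule cannot be found this way.
        return tail if tail in frozen else None
    # A list-valued __path__ (what the attribute should be) skips the
    # frozen check entirely.
    return None
```

With that sketch, find_frozen_sketch("pkg.mod", "pkg") finds the submodule, while passing None or an empty list for the path does not, which is why neither fixing nor deleting __path__ is enough on its own.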

I suspect all of this was done this way as an optimization. By requiring __path__ to be a specific type you short-circuit the frozen module check in the common package case. Plus you don't need to loop through __path__ when it is just a string. But I don't get why the import is done using only the tail part of the module name when __path__ is not set. I guess it is just assumed to be right since the package check was already done. That said, the package check does a lot more work than it needs to: it goes through string concatenation that could easily be avoided if it just used the full module name (probably just a remnant from an old way of doing things).