2006-12-14

Import rewrite basically semantically complete!

Today I ran regrtest using my import rewrite with the bsddb, compiler, decimal, network, and subprocess resources enabled, and (for the most part) the tests passed!

Ignoring the bsddb3 failure (which was over some commit call not being made, and an individual re-run had a different error, so I think it is transient), there are only four issues at the moment, and they all have something to do with import being rewritten in Python.

First, runpy (the '-m' command-line option) doesn't work. It requires the optional get_code method as specified in PEP 302 to be defined. When it isn't, as is the case with the built-in import, it uses an undocumented function in pkgutil to get one. The new code, however, defines a loader without a get_code method, so runpy raises an AttributeError. This can be fixed by me putting in the time to add the get_code method.
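A minimal sketch of what adding get_code to a loader might look like, assuming a loader that handles a single source file. The class name SketchSourceLoader and the helper run_with_loader are hypothetical and are not taken from the rewrite itself; only the get_source and get_code method names come from PEP 302.

```python
class SketchSourceLoader:
    """Hypothetical PEP 302-style loader for one source file (illustration only)."""

    def __init__(self, fullname, path):
        self.fullname = fullname
        self.path = path

    def get_source(self, fullname):
        # PEP 302 asks for ImportError when the loader cannot handle the module.
        if fullname != self.fullname:
            raise ImportError("loader cannot handle %r" % fullname)
        with open(self.path) as source_file:
            return source_file.read()

    def get_code(self, fullname):
        # Compile the module's source into a code object that a runner can exec.
        return compile(self.get_source(fullname), self.path, "exec")


def run_with_loader(loader, fullname):
    """Hypothetical helper: roughly what a '-m'-style runner does with get_code."""
    namespace = {"__name__": "__main__", "__loader__": loader}
    exec(loader.get_code(fullname), namespace)
    return namespace
```

A loader that lacks get_code fails at the loader.get_code(fullname) call with an AttributeError, which matches the failure described above.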

Second, test_pkg fails because it is an old-style test and thus compares stdout output. The deal is that it executes 'dir' on some modules which now have __loader__ defined, so it craps out. Obviously not a failure on my part.

Third, the stack level argument for warnings.warn and friends is now off compared to what people used to expect for module deprecation. The original import is written in C, and thus does not show up in the call chain. This means that if you deprecate a module and have a stack level of 2 then the warning is attributed to the caller's module. But with the import code being in Python, that just leads into import code instead of the caller. This is only an issue for module-level warnings during an import, so it is not a catastrophic problem. But still, this is unfortunate.
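The effect can be reproduced without touching import at all. The sketch below uses plain functions as stand-ins: deprecated_module_body plays the module-level code of a deprecated module, and python_level_import plays import machinery written in Python. All of these names are hypothetical; a C-level import is modeled simply as a direct call, since it adds no Python frame.

```python
import warnings


def deprecated_module_body():
    # Stand-in for module-level code that warns when the module is imported.
    warnings.warn("this module is deprecated", DeprecationWarning, stacklevel=2)


def user_code_with_c_import():
    # A C import adds no Python frame, so the module body is called directly.
    deprecated_module_body()


def python_level_import():
    # Stand-in for import code written in Python: one extra frame on the stack.
    deprecated_module_body()


def user_code_with_python_import():
    python_level_import()


def blamed_function(entry_point):
    """Return the name of the function the warning was attributed to."""
    candidates = [user_code_with_c_import, python_level_import,
                  user_code_with_python_import]
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        entry_point()
    lineno = caught[0].lineno
    # The owner is the candidate defined closest above the reported line.
    started_before = [f for f in candidates
                      if f.__code__.co_firstlineno <= lineno]
    owner = max(started_before, key=lambda f: f.__code__.co_firstlineno)
    return owner.__name__
```

With the direct call, the warning lands on user_code_with_c_import, which is the caller. With the extra frame, the same stacklevel=2 lands on python_level_import, which is the import code rather than the caller.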

Lastly, PEP 235 has not been implemented. This PEP specifies how to handle case-sensitivity on case-preserving but case-insensitive filesystems. I don't know how to check the exact casing of file paths (I use os.path.exists for everything and I am willing to bet it ignores case on case-insensitive filesystems) so I have not done this yet. Anyone know how to do this without exposing C-level code?
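One possible pure-Python approach, offered as an assumption rather than as what the rewrite does: os.listdir returns names with the casing stored on disk, so an exact-case check can compare the requested file name against the directory listing. The function name exists_with_exact_case is hypothetical, and the sketch checks only the final path component.

```python
import os


def exists_with_exact_case(path):
    """Return True only if the last component of path exists with this casing.

    Hypothetical helper: only the final component is compared; the casing of
    parent directories is not verified.
    """
    directory, name = os.path.split(path)
    if not name:
        # Path ends in a separator; there is no file name to compare.
        return os.path.exists(path)
    try:
        # listdir reports names as stored on disk, even on filesystems
        # that ignore case when opening files.
        return name in os.listdir(directory or os.curdir)
    except OSError:
        # Missing or unreadable directory: treat the file as absent.
        return False
```

The cost is a directory listing per check, so caching the listing per directory would likely be needed before using something like this inside import.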

But that's it! Only the last two are real issues and really only the 'warnings' problem poses a possible issue that can't be resolved in a backwards-compatible way. I am going to take a pass over the docs and code before I officially announce to python-dev sometime next week.

If you look at the Google doc that I have been maintaining you will see a list of bonus work that I would like to do which includes benchmarking. There is also rewriting zipimport and reload in Python along with implementing a sqlite3 import hook (Paul Moore sent me some initial code but I have not looked at it).

I suspect people want benchmarking numbers, so I will probably do that first. It would also help for profiling if this gets made the default import implementation.

So an 813-line file, along with double that amount of test code, gets us a new implementation of import. Feels nice to have pulled this off. I really hope that I can at some point work out the bootstrapping issues and make this the built-in import implementation. That would just make me even more proud of this chunk of code.