2007-05-02

__import_ex__

I thought some more today about how I would replace __import__ and tweak the IMPORT_NAME and IMPORT_FROM bytecodes. I am not going to be writing a PEP for any of this for a while, but I figured I might as well write out what I am thinking in case anyone else is interested in this topic.

First, the signature for __import__ needs a major overhaul. Since changing the existing signature is out of the question for backwards-compatibility reasons, I figured I could introduce a new function, __import_ex__(name:Sequence[str], caller__name__:str, caller__path__:(Sequence[str]|None)=None) -> object. Here is an explanation of the parameters:
  • name : A list of strings representing the parts of the module to import. Empty strings at the front of the list represent a dot in the name. It is a list of strings instead of just a string so as to have the bytecode handle the string splitting, making the function have one less thing to deal with. Plus if you are programmatically creating an import there is just as much of a chance (if not more) that you are building up the import and thus probably would prefer to work with a list than do a bunch of string manipulations. But the parameter can easily be changed to be a string.
  • caller__name__ : The name of the module requesting the import.
  • caller__path__ : The value of __path__ if the caller has it defined, else None. This could be a boolean instead of the actual value stored in __path__. The reason for not doing that is that import will end up needing the value from __path__ anyway, so it might as well grab it now instead of having to fetch the caller from sys.modules and get the path value then. Plus it helps facilitate testing by having fewer things happen behind the scenes.
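
To make the parameters concrete, here is a rough sketch of how a name list could be built and then resolved to an absolute name. The helpers split_import_name and resolve_name are hypothetical names I am using only for illustration, and the rule that a caller with __path__ set is treated as its own package is my assumption about how the resolution would work, not a settled design.

```python
def split_import_name(dotted):
    """Turn '..spam.bacon' into ['', '', 'spam', 'bacon'] (hypothetical helper)."""
    stripped = dotted.lstrip('.')
    level = len(dotted) - len(stripped)  # number of leading dots
    parts = stripped.split('.') if stripped else []
    return [''] * level + parts


def resolve_name(name, caller__name__, caller__path__=None):
    """Return the absolute dotted name for a name list (hypothetical helper)."""
    # Count the empty strings at the front; each one stands for a leading dot.
    level = 0
    while level < len(name) and name[level] == '':
        level += 1
    parts = list(name[level:])
    if level == 0:
        return '.'.join(parts)  # already absolute
    # Assumption: a caller with __path__ is a package and anchors the import
    # itself; a plain module anchors at its parent package.
    if caller__path__ is not None:
        package = caller__name__
    else:
        package = caller__name__.rpartition('.')[0]
    bits = package.split('.') if package else []
    up = level - 1  # one dot means "this package", each extra dot goes up one
    if not bits or up >= len(bits):
        raise ImportError('attempted relative import beyond top-level package')
    return '.'.join(bits[:len(bits) - up] + parts)


name = split_import_name('...blah')
print(name)
print(resolve_name(name, 'foo.bar.a.b.c'))
```

Passing the list form means the function never has to count dots in a string; it only has to count empty strings at the front.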
Before I discuss the return value, one must look at how the current bytecode plays into how __import__ works. When you do a named import like ``import spam.bacon.python`` the bytecode used is IMPORT_NAME. This leads to 'spam' being on the stack. The module is then stored into the namespace as 'spam'. This allows normal resolution of name lookup on the objects to work for 'spam.bacon.python'.

But it's different for ``from spam.bacon import python``. That uses two bytecodes: IMPORT_NAME and IMPORT_FROM. The first bytecode gets 'spam.bacon' on to the stack. The second gets 'python' off of 'spam.bacon' and puts that on the stack to be bound to the name 'python'.
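
The difference is easy to see for yourself. The snippet below is just an illustration using the standard dis module and the stdlib xml package as a stand-in for spam; import_opcodes is a name I made up for this example, and the surrounding opcodes will vary between interpreter versions.

```python
import dis


def import_opcodes(statement):
    """List the IMPORT_* opcodes a statement compiles to (illustrative helper)."""
    code = compile(statement, '<example>', 'exec')
    return [ins.opname for ins in dis.get_instructions(code)
            if ins.opname.startswith('IMPORT_')]


print(import_opcodes('import xml.dom.minidom'))
print(import_opcodes('from xml.dom import minidom'))

# The same split shows up when calling __import__ directly:
# without a fromlist you get the root package back ...
root = __import__('xml.dom.minidom')
print(root.__name__)
# ... and with a non-empty fromlist you get the module named in the call.
leaf = __import__('xml.dom', fromlist=['minidom'])
print(leaf.__name__)
```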

As you may have noticed, IMPORT_NAME puts different things on the stack based on what type of import statement is used. To me that is kind of nasty. Why can't you just return the root module of what is imported? Well, think of a relative import like '...blah' that resolves to 'foo.bar.blah'. If you just put the root module on the stack then you have 'foo'. But from 'foo' alone there is no way to tell that what you actually want is stuff off of the 'foo.bar.blah' module, since the absolute name the relative import resolved to has been lost.

What to do? Well, __import_ex__ could return a tuple of the module and the absolute name of what was requested, but that seems wasteful. One could add a flag to __import_ex__ to signify that the leaf module is to be returned instead of the root, much like how the presence of fromlist in __import__ does now. That would let it work much like it does now.

Or __import_ex__ could always return the leaf module. Then a new bytecode could be introduced to return the root module for a leaf module that was imported. This is as easy as ``sys.modules[mod.__name__.partition('.')[0]]``.

Either of the last two options works. Either a flag goes on to __import_ex__ to say exactly which module is to be returned, or we end up with more fine-grained bytecode. I personally vote for the latter. You would end up with bytecode like this:
  • IMPORT_NAME : Pop the name of the module to import off the stack and push that module.
  • IMPORT_ROOT : Pop a module off of the stack and push the root module.
  • IMPORT_FROM : Pop a module and a list of items to get from that module and push the requested items. If a specified item is not on the object, try to do an import as needed to get it.
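
To get a feel for how the three bytecodes would divide the work, here is a sketch that models each one as a plain Python function. This is only my approximation: the function names are made up, importlib.import_module stands in for __import_ex__ (it takes a dotted string rather than a list), and real bytecodes would work on the stack rather than on arguments and return values.

```python
import importlib
import sys


def import_name(name):
    """Model of IMPORT_NAME: always hand back the leaf module."""
    return importlib.import_module(name)


def import_root(module):
    """Model of IMPORT_ROOT: swap a module for its root module."""
    return sys.modules[module.__name__.partition('.')[0]]


def import_from(module, items):
    """Model of IMPORT_FROM: fetch items, importing submodules as needed."""
    found = []
    for item in items:
        if not hasattr(module, item):
            # Not an attribute yet, so try importing it as a submodule;
            # a successful import binds it on the parent package.
            importlib.import_module(module.__name__ + '.' + item)
        found.append(getattr(module, item))
    return found


# ``import xml.dom.minidom`` would be IMPORT_NAME followed by IMPORT_ROOT.
leaf = import_name('xml.dom.minidom')
xml = import_root(leaf)

# ``from xml.dom import minidom`` would be IMPORT_NAME followed by IMPORT_FROM.
(minidom,) = import_from(import_name('xml.dom'), ['minidom'])
print(xml.__name__, minidom.__name__)
```

With this split, IMPORT_NAME does the same thing no matter which kind of import statement triggered it, and the statement-specific behavior lives in the bytecode that follows.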
As I said, I am not about to write a PEP on any of this for some time and thus have plenty of time to mull over the options in my head for a while.