While the details vary, the generally agreed-upon solution is to call some custom function that returns a module proxy. That proxy triggers the actual import once the module is used in some meaningful way, such as when an attempt is made to access an attribute. I also assumed that people used either a custom importer (meta path or not) or a custom __import__ function. But beyond that I had no clue how people solved the issue. The importer solution can be seen in a cookbook recipe and the custom function can be seen in peak.util.imports (source).
Going in blind on this, I started to think about what it would take to pull this off using importlib in Python 3.1 for Python source. I was not going to worry about built-in modules and the like, as importing those types of modules is much cheaper than importing Python source or bytecode. And I wanted the solution to be as simple as possible since I was doing this for fun, not as an actual robust solution to a problem I had.
With those goals in mind, I realized that I needed to come up with a lazy loader that could be subclassed for ease of use. I also knew I needed to come up with a module proxy object that would trigger an import the instant an attempt was made to access an attribute. Seems simple enough, but obviously it isn't obvious.
The first problem I had to solve was how the module proxy would get at the real loader when it was eventually needed. That was quickly solved by assigning the lazy loader to __loader__ and treating the class as a mix-in. Assuming the lazy loader class was named LazyMixin and I set up the inheritance correctly, I could get at the real loader from within the module proxy with super(LazyMixin, self.__loader__). By re-assigning the resulting object to __loader__ I would continue to have the actual loader doing the loading on the module, as well as have a consistent way to get at the loader I was after. Assuming a module proxy class named LazyModule, the code looks like this:
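import sys

class LazyMixin:
    def load_module(self, name):
        module = LazyModule(name)  # proxy; nothing is executed yet
        module.__loader__ = self   # lets the proxy find the real loader later
        sys.modules[name] = module
        return module              # PEP 302: load_module() returns the module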
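How super(LazyMixin, ...) reaches past the mix-in can be seen in isolation. The sketch below is only an illustration: RealLoader, LazyRealLoader, and the strings they return are hypothetical stand-ins, not anything from importlib. The only thing it relies on is the method resolution order.

```python
class RealLoader:
    """Hypothetical stand-in for a real importlib loader."""

    def load_module(self, name):
        return "really loaded " + name


class LazyMixin:
    """Overrides load_module; it has to come first in the list of bases."""

    def load_module(self, name):
        return "deferred " + name


class LazyRealLoader(LazyMixin, RealLoader):
    pass


loader = LazyRealLoader()

# Start the method lookup *after* LazyMixin in the MRO, i.e. at RealLoader.
real_loader = super(LazyMixin, loader)

lazy_result = loader.load_module("spam")       # LazyMixin's version
real_result = real_loader.load_module("spam")  # RealLoader's version
```

If the bases were listed the other way around, RealLoader.load_module would win the normal lookup and the mix-in would never be called, which is why the inheritance has to be set up with the mix-in first.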
With that problem solved I turned my attention to the module proxy. After some mishaps with trying to dynamically set __getattr__ I had an epiphany: I could simply reset __class__ to a class that lacked __getattribute__. By having both the lazy module class, with __getattribute__ defined, and the module class that __class__ gets set to inherit from types.ModuleType, I could make sure that the module instance was always consistent in terms of its type. This all led to the following module objects:
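class LazyModule(types.ModuleType):
    def __getattribute__(self, attr):
        # Swap in the hook-free Module class first, so the attribute
        # accesses below are ordinary lookups instead of recursing here.
        self.__class__ = Module
        self.__loader__ = super(LazyMixin, self.__loader__)
        self.__loader__.load_module(self.__name__)  # the deferred import
        return getattr(self, attr)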
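A small experiment shows that the __class__ swap makes the hook fire exactly once. This is a sketch under assumptions: the calls list and the hard-coded answer attribute are hypothetical stand-ins for the real import, added only so there is something to observe.

```python
import types

calls = []  # records every time the hook runs


class Module(types.ModuleType):
    """Ordinary module type with no attribute hook."""


class LazyModule(types.ModuleType):
    def __getattribute__(self, attr):
        calls.append(attr)
        self.__class__ = Module        # the hook is gone from here on
        self.__dict__["answer"] = 42   # stand-in for the real import
        return getattr(self, attr)


proxy = LazyModule("demo")
first = proxy.answer    # goes through LazyModule.__getattribute__
second = proxy.answer   # plain ModuleType lookup, hook not involved
```

After the first access the object is an ordinary Module instance, so later attribute lookups cost the same as on any other module.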
The final problem to work out was how to make sure that the module instance was what got initialized by the import, so that all pre-existing references to the module instance worked after the import. The obvious way is to make the module proxy act as an actual proxy; you could store the initialized module in the proxy and forward all attribute requests to the initialized module using __getattribute__. But that has a performance penalty that seems unnecessary, and I had a gut feeling there had to be a more elegant solution.
That more elegant solution is module reloading. If you look at PEP 302 you will notice that, to support module reloading, a loader is expected to reuse the module found in sys.modules when performing a load. This makes sure that existing references to a module continue to work, as the module instance doesn't change, just its __dict__. So by sticking the module proxy instance into sys.modules, the real loader would use that module object for the initialization, letting all pre-existing references continue to be valid.
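The "same instance, new __dict__ contents" behaviour of reloading is easy to check. The sketch below assumes a newer Python where importlib.reload is available (in the 3.1 era the equivalent was imp.reload), and the reload_demo module it writes to a temporary directory is a hypothetical throwaway.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "reload_demo.py")  # hypothetical throwaway module
    with open(path, "w") as f:
        f.write("value = 1\n")

    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        before = importlib.import_module("reload_demo")

        # Change the source on disk, then reload the module.
        with open(path, "w") as f:
            f.write("value = 2  # edited on disk\n")
        after = importlib.reload(before)
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("reload_demo", None)

same_object = after is before  # the module instance was reused
new_value = before.value       # the old reference sees the new contents
```

The reference taken before the reload picks up the new value because the loader re-initialized the existing module object rather than creating a fresh one.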
It turns out that all of this was enough for a proof-of-concept to work! By creating a class that inherited from both the lazy loader and importlib's private Python source file loader I was able to defer the actual importing of a source file until an attribute on the module was accessed. And the best part was that the only requirement of the real loader was that it follow the rules set out in PEP 302, which importlib helps with through importlib.util.module_for_loader. And an even cooler thing is that since the lazy loader is a mix-in that only overrides load_module, all other methods from the real loader are accessible, meaning introspection APIs such as importlib.abc.InspectLoader continue to work! And the greatest perk of all is that since it works with loaders it is completely transparent to the import statement, removing any need to use a custom function to make this work!
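To try the idea without touching importlib's private loader, here is a minimal, self-contained sketch. It pairs the LazyMixin, LazyModule, and Module classes described in this post with a hypothetical StringLoader that executes source held in a dict. StringLoader, its sources and executed attributes, and the lazy_demo module name are all assumptions made for illustration, and load_module is called directly rather than through the import statement.

```python
import sys
import types


class Module(types.ModuleType):
    """Plain module type the proxy turns into once it is loaded."""


class LazyModule(types.ModuleType):
    def __getattribute__(self, attr):
        # Swap the class first so the lookups below are ordinary ones.
        self.__class__ = Module
        self.__loader__ = super(LazyMixin, self.__loader__)
        self.__loader__.load_module(self.__name__)
        return getattr(self, attr)


class LazyMixin:
    def load_module(self, name):
        module = LazyModule(name)
        module.__loader__ = self
        sys.modules[name] = module
        return module


class StringLoader:
    """Hypothetical 'real' loader: runs source code stored in a dict."""

    sources = {"lazy_demo": "value = 6 * 7\n"}
    executed = []  # names of modules whose source has actually run

    def load_module(self, name):
        # PEP 302 rule: reuse the module already in sys.modules, if any.
        module = sys.modules.setdefault(name, types.ModuleType(name))
        self.executed.append(name)
        exec(self.sources[name], module.__dict__)
        return module


class LazyStringLoader(LazyMixin, StringLoader):
    pass


loader = LazyStringLoader()
module = loader.load_module("lazy_demo")

ran_before_access = list(StringLoader.executed)  # nothing has run yet
value = module.value                             # first access runs the source
ran_after_access = list(StringLoader.executed)
```

Because StringLoader reuses the object it finds in sys.modules, the proxy handed out by LazyMixin is the same object that ends up initialized, so any reference taken before the first attribute access stays valid.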
To see the proof-of-concept, which is NOT PRODUCTION QUALITY -- remember it is using private APIs from importlib which could easily disappear at any time -- see this paste bin. I am sure the code should be more robust somehow, but I am rather pleased that the solution is only 20 source lines and appears to work, at least in a really dead-simple example. If people actually like this approach and think they would find it useful, then leave a comment and I will see if I can be persuaded to package it up and write the proper tests so that I would be willing to put this up on the Cheeseshop.